Research Office

Wilson Hall
371 Wilson Boulevard
Rochester, Michigan 48309-4486
(248) 370-2762
(248) 370-4111
# Oakland Research

HPC (High Performance Computing Cluster)

High Performance Computing Cluster: Matilda

Compute
The Oakland University (OU) central HPC Linux-based cluster (Matilda) is intended to support parallel, GPU, and other applications that are not suitable for individual computers. The Matilda HPC cluster consists of approximately 2,200 cores. All nodes are interconnected with 100 Gbps InfiniBand networking.

The Matilda HPC cluster includes the following compute nodes:

  • 40 standard compute nodes, each with 192 GB of RAM and 40 CPU cores at 2.50 GHz.
  • 10 high throughput nodes, each with 192 GB of RAM and 8 CPU cores at 3.80 GHz.
  • 4 large memory nodes, each with 768 GB of RAM and 40 CPU cores at 2.50 GHz.
  • 4 hybrid nodes, each with capacity to include specialized accelerator cards or GPUs and 40 CPU cores at 2.50 GHz.
  • 3 GPU nodes, each with 4 NVIDIA Tesla V100 16G GPUs with NVLink, 192 GB RAM and 48 CPU Cores at 2.10 GHz.
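As a quick cross-check of the "approximately 2,200 cores" figure, the node list above can be tallied in a few lines of Python. This is only an illustrative sketch: the names `NODE_TYPES`, `total_cores`, and `total_gpus` are made up for this example, and the counts are copied from the list as published.

```python
# Node inventory copied from the list above:
# (label, number of nodes, CPU cores per node)
NODE_TYPES = [
    ("standard compute", 40, 40),
    ("high throughput", 10, 8),
    ("large memory", 4, 40),
    ("hybrid", 4, 40),
    ("GPU", 3, 48),
]


def total_cores(node_types):
    """Sum nodes * cores-per-node over every node type."""
    return sum(count * cores for _, count, cores in node_types)


def total_gpus(gpu_nodes=3, gpus_per_node=4):
    """GPU count from the last list item: 3 nodes with 4 V100 GPUs each."""
    return gpu_nodes * gpus_per_node


for label, count, cores in NODE_TYPES:
    print(f"{label:>16}: {count * cores} cores")
print("total cores:", total_cores(NODE_TYPES))  # 2144
print("total GPUs:", total_gpus())              # 12
```

The listed nodes add up to 2,144 cores, which is consistent with the rounded figure quoted above.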

Storage
The system includes 690 TB of high-speed scratch storage using a high-performance parallel file system connected via 100 Gbps InfiniBand to each compute node.

Home directories, project space, and shared software reside on a Dell EMC Isilon H500, a storage system with an integrated backup solution. Data is replicated to a Dell EMC Isilon A2000 located in a secondary data center with independent power and HVAC systems. The Dell EMC Isilon A2000 can also provide an archival mechanism to Amazon Web Services.

Intra-networking
All Matilda HPC cluster nodes are interconnected with HDR100 InfiniBand, providing up to 100 Gbps of bandwidth and sub-0.6 μs latency.

Inter-networking
The Matilda HPC cluster is connected to the Oakland University campus network with 10 Gbps connectivity to provide access to storage systems and to and from researchers' labs and workstations.

Software
The Matilda HPC cluster includes a comprehensive suite of open source research software, including the major software compilers and many common research-specific applications.

Data Center Facilities
The Matilda cluster is housed within the North Foundation Hall data center. The facility is equipped with a fire suppression system, a backup generator, and environmental controls.

Base Resource Allocation
Upon request, all OU-affiliated researchers receive 50 GB of home directory storage and 10 TB of scratch storage¹ on the Matilda cluster. This allocation allows OU-affiliated researchers to access the Matilda cluster and to submit jobs as part of a PI project/group.

PIs are also provided with shared project space for research projects or group projects. These allocations are assigned to the PI and can be used by members of their group:

  • Compute hours²: 1,000,000 per year
  • GPU hours³: 50,000 per year
  • Shared project/group storage: 1 TB
  • Shared project/group scratch¹ storage: 10 TB

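Compute hours and GPU hours are convertible, so researchers can use their allocation in whatever ways make best sense for their specific needs. GPU hours carry a billing weight of 10 times that of CPU hours, meaning that 100 GPU hours are equivalent to 1,000 CPU hours, and 100 CPU hours are equivalent to 10 GPU hours. Consequently, each researcher has an effective annual allocation of 1.5 million hours available for use. Usage by a PI and their group is tracked in aggregate, and usage resets to zero at the start of each calendar year.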
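The conversion described above can be written out as a short Python sketch. The function and constant names here are hypothetical and chosen only for illustration. The numbers (a 10x billing weight, 1,000,000 compute hours, and 50,000 GPU hours) come from the allocation described above. How the cluster's accounting system actually records usage is not specified on this page.

```python
# Figures taken from the base allocation described above.
GPU_BILLING_WEIGHT = 10         # 1 GPU hour counts as 10 compute (CPU) hours
BASE_COMPUTE_HOURS = 1_000_000  # per PI group, per calendar year
BASE_GPU_HOURS = 50_000         # per PI group, per calendar year


def gpu_to_compute_hours(gpu_hours):
    """Express GPU hours as the equivalent number of compute hours."""
    return gpu_hours * GPU_BILLING_WEIGHT


def compute_to_gpu_hours(compute_hours):
    """Express compute hours as the equivalent number of GPU hours."""
    return compute_hours / GPU_BILLING_WEIGHT


def effective_compute_hours(compute_hours=BASE_COMPUTE_HOURS,
                            gpu_hours=BASE_GPU_HOURS):
    """Total allocation expressed in compute-hour equivalents."""
    return compute_hours + gpu_to_compute_hours(gpu_hours)


def remaining_compute_hours(used_compute_hours, used_gpu_hours):
    """Hypothetical helper: balance left after a group's aggregate usage."""
    used = used_compute_hours + gpu_to_compute_hours(used_gpu_hours)
    return effective_compute_hours() - used


print(effective_compute_hours())                # 1500000
print(gpu_to_compute_hours(100))                # 1000
print(compute_to_gpu_hours(100))                # 10.0
print(remaining_compute_hours(200_000, 5_000))  # 1250000
```

In this sketch, a group that has used 200,000 compute hours and 5,000 GPU hours has consumed the equivalent of 250,000 compute hours, leaving 1,250,000 for the rest of the calendar year.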

Rates for Additional Computational Resources
Researchers who need additional computational time beyond the annual base allocation can purchase additional resources. The current rates (revised every two years) are:

  • Compute hours²: $0.024 per hour
  • GPU hours³: $0.24 per hour

Additional purchased computational resources are placed in a separate account that is accessible to the researcher and any other group members they choose. Unlike base allocation amounts (which are "use or lose", meaning unused portions do not roll over from one year to the next), unused purchased resources remain available until exhausted. To use additional purchased hours, a researcher or group member must specify the account to be used during job submission.
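To get a rough sense of what additional hours might cost, the rates listed above can be combined with the per-core and per-GPU definitions in footnotes 2 and 3. The Python sketch below is an estimate only: the helper names are hypothetical, the rates are those quoted on this page at the time of writing, and an actual quote from UTS may differ.

```python
from decimal import Decimal

# Rates quoted above, in USD per hour (revised every two years).
COMPUTE_RATE = Decimal("0.024")
GPU_RATE = Decimal("0.24")


def compute_hours(cpu_cores, wall_hours):
    """Footnote 2: compute hours are counted per CPU core used."""
    return cpu_cores * wall_hours


def gpu_hours(gpus_requested, wall_hours):
    """Footnote 3: GPU hours are counted per GPU requested."""
    return gpus_requested * wall_hours


def purchase_cost(extra_compute_hours=0, extra_gpu_hours=0):
    """Estimated cost in USD of buying the given number of extra hours."""
    return (COMPUTE_RATE * Decimal(str(extra_compute_hours))
            + GPU_RATE * Decimal(str(extra_gpu_hours)))


cpu_job = compute_hours(cpu_cores=40, wall_hours=100)  # 4000 compute hours
gpu_job = gpu_hours(gpus_requested=2, wall_hours=50)   # 100 GPU hours

print(f"${purchase_cost(extra_compute_hours=cpu_job):.2f}")  # $96.00
print(f"${purchase_cost(extra_gpu_hours=gpu_job):.2f}")      # $24.00
```

`Decimal` is used instead of floating point so that small per-hour rates add up without rounding surprises.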

Buy-in Nodes
Researchers who need hardware capacity beyond what is currently available on the Matilda cluster can purchase additional nodes. UTS staff will add purchased nodes to the cluster and manage them together with the rest of the cluster. Buy-in users and their research groups will have priority access⁴ to all of the cluster resources they purchase. They will also receive additional compute time (CPU or GPU, as needed or desired) in the calendar year they purchase resources, based on the rate in effect at the time of purchase.

To purchase nodes, please contact UTS to discuss your needs and obtain a quote. The exact price will depend on the hardware selected, plus any incidentals that may be needed to connect the new hardware to the cluster.

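Rates for Additional Storage
Researchers or groups who need additional storage beyond the annual base allocation can purchase additional space, based on their specific storage needs. There are two base storage types: storage on the Matilda HPC cluster itself, or storage in one or more OU data centers that is not directly accessible from the Matilda cluster. The current rates (revised every two years) are: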

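  • Matilda project or home directory quota: $260 per TB per year
  • Matilda scratch space quota: $72 per TB per year
  • Performance tier: $170 per TB per year
  • Archive tier: $90 per TB per year
  • Replicated performance tier: $250 per TB per year
  • Replicated performance tier with deep archive: $260 per TB per year
  • Archive tier with deep archive: $90 per TB per year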
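For budgeting purposes, the storage rates above can be placed in a small lookup table. The following Python sketch is illustrative only: the dictionary keys and the function name are hypothetical labels invented for this example, and it assumes storage is billed in whole-year units at a flat per-TB rate, which this page does not state explicitly.

```python
# Rates from the list above, in USD per TB per year.
# The keys are made-up labels for this sketch, not official tier names.
STORAGE_RATES = {
    "matilda_project_or_home": 260,
    "matilda_scratch": 72,
    "performance": 170,
    "archive": 90,
    "replicated_performance": 250,
    "replicated_performance_deep_archive": 260,
    "archive_deep_archive": 90,
}


def storage_cost(tier, terabytes, years=1):
    """Estimated cost in USD for `terabytes` of `tier` storage over `years`."""
    if tier not in STORAGE_RATES:
        raise ValueError(f"unknown storage tier: {tier!r}")
    return STORAGE_RATES[tier] * terabytes * years


print(storage_cost("matilda_scratch", 5))                   # 360
print(storage_cost("matilda_project_or_home", 2, years=3))  # 1560
```

For example, 5 TB of additional Matilda scratch quota would come to $360 per year at the listed rate.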

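Support
The Matilda HPC cluster services are provided through a collaboration between the Oakland University Research Office and University Technology Services. For more information, please visit the University Technology Services research support page or the Research Computing and HPC Documentation website. To request access, please fill out the Matilda HPC Cluster Access Request form (scroll down to "Matilda"; the online form requires OU login).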


¹ Scratch storage is short-term storage used only for working files. It is not backed up or mirrored. Inactive files (determined by the last time they were accessed) are deleted after 45 days.

² Compute hours are measured per CPU core used in a job. A job running on 40 CPU cores for one hour would consume 40 compute hours.

³ GPU hours are measured per GPU requested, as typically only one job can be run on a GPU at a time. A job requesting 2 GPU resources and running for one hour would consume 2 GPU hours.

⁴ Priority access means that users are guaranteed to be able to start a job on a purchased resource in less than four hours when they need the purchased resource for a research project. Priority access to purchased resources lasts for five years from the date of purchase or the anticipated useful life of the hardware, whichever is less. When the purchaser is not using the purchased resources, they will be available to other cluster users for a maximum walltime of 4 hours per job.
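Because scratch files that have not been accessed for 45 days are deleted (footnote 1), it can be useful to check which working files are approaching that limit. The Python sketch below is one possible approach, not the cluster's actual purge mechanism: it assumes that "inactive" corresponds to the file system's last-access timestamp (`st_atime`), and the function name `inactive_files` is hypothetical.

```python
import os
import time
from pathlib import Path

PURGE_AFTER_DAYS = 45  # inactivity limit stated in footnote 1
SECONDS_PER_DAY = 86400


def inactive_files(root, days=PURGE_AFTER_DAYS, now=None):
    """Yield (path, idle_days) for files under `root` whose last access
    time is more than `days` days before `now`."""
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                atime = path.lstat().st_atime  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if atime < cutoff:
                yield path, (now - atime) / SECONDS_PER_DAY
```

Calling `inactive_files(some_directory, days=30)` would list files that have been idle for more than 30 days, giving some warning before the 45-day limit is reached. The directory to scan is left to the caller, since the scratch path on Matilda is not given on this page.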