
I have an HPC cluster with 22 nodes and one head node acting as master, running Rocks Cluster OS (which is based on CentOS).

The nodes and the master communicate over a private network (10.10.0.0/16). We SSH to the server over a routed public network (192.168.xxx.xxx/24), and that network is not routed to the worker nodes.

Now our data has reached its limit: we can't add any more disks to the master.

We want to build a Lustre cluster consisting of two OSSs and one MDS. My question is:

Do we have to connect the Lustre OSS and MDS to the same network as the HPC nodes (10.10.0.0/16), so that the nodes can mount our new Lustre filesystem as Lustre clients?

Or can we just mount the Lustre client on the master node and share the Lustre filesystem through NFS to the HPC worker nodes?

We will have other Lustre clients outside the HPC environment, so we plan to configure Lustre on the 192.168.xxx.xxx/24 network.
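For what it's worth, serving both networks from the same servers is something Lustre's LNet layer supports: you can declare one LNet network per interface. A minimal sketch of the module options on the OSS/MDS nodes, assuming hypothetical interface names eth0 (private side) and eth1 (public side):

```shell
# /etc/modprobe.d/lustre.conf on the OSS/MDS nodes (sketch only;
# the interface names eth0/eth1 are assumptions for illustration)
#   tcp0 = cluster-private network 10.10.0.0/16
#   tcp1 = routed public network 192.168.xxx.xxx/24
options lnet networks="tcp0(eth0),tcp1(eth1)"
```

HPC nodes would then reach the servers via their @tcp0 NIDs, and the outside clients via @tcp1.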

Any suggestions?


1 Answer


Do we have to connect the Lustre OSS and MDS to the same network as the HPC nodes (10.10.0.0/16), so that the nodes can mount our new Lustre filesystem as Lustre clients?

Yes. Lustre is a parallel filesystem built for direct, high-performance client-to-server I/O, so every client needs network (LNet) connectivity to the OSS and MDS nodes. Since your cluster network is private, the Lustre servers should sit on (or be routed to) the 10.10.0.0/16 network.
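Concretely, each worker node then runs the Lustre client stack and mounts the filesystem directly from the servers over the private network. A sketch, where the filesystem name `lustrefs` and the MGS address 10.10.0.50 are assumptions for illustration:

```shell
# On each worker node (sketch; "lustrefs" and the MGS NID
# 10.10.0.50@tcp are hypothetical values for your setup)
modprobe lustre                                  # load the Lustre client modules
mkdir -p /mnt/lustre
mount -t lustre 10.10.0.50@tcp:/lustrefs /mnt/lustre
```

With this layout every client talks to all OSSs in parallel, which is the whole point of running Lustre instead of a single file server.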

Or can we just mount the Lustre client on the master node and share the Lustre filesystem through NFS to the HPC worker nodes?

For a parallel filesystem, no, this is not feasible: all I/O would funnel through the single master node acting as the NFS server, throwing away Lustre's parallel bandwidth.

    Note that you _can_ re-export Lustre with NFS to other nodes, but this should only be used for mounting it on non-Linux nodes (Windows, OS/X). Using NFS re-export limits the available bandwidth to what the NFS server (Lustre client) can provide, removing all parallel IO benefits, so should not be used for Linux clients. If you need to bridge between different networks (e.g. IB and TCP) it is possible to use LNet routers for this, which is a different topic of conversation. – LustreOne Apr 19 '18 at 18:43
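The LNet-router option mentioned in the comment could be sketched as follows, assuming a dedicated router node with one interface on each network (all interface names and the 192.168.1.10 router address are illustrative, not from the original setup):

```shell
# /etc/modprobe.d/lustre.conf on the LNet router node (sketch):
# one LNet network per interface, with forwarding between them.
options lnet networks="tcp0(eth0),tcp1(eth1)" forwarding=enabled

# /etc/modprobe.d/lustre.conf on an outside client on the public
# network (sketch): reach the private tcp0 network through the
# router's tcp1 NID (192.168.1.10 is an assumed address).
options lnet networks="tcp1(eth0)" routes="tcp0 192.168.1.10@tcp1"
```

That would keep the OSS/MDS on the private cluster network while still serving Lustre natively to clients outside it, without the NFS bottleneck.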