Open Storage in the Enterprise With Gluster and Ceph

Dustin L. Black, RHCA, RHCA

Principal Cloud Success Architect

Red Hat Customer Experience & Engagement

November 2015

Abstract

Brief

A first-hand account of how Gluster and Ceph are being leveraged in multiple industries to simplify the data center and drive new technologies and architectures.

Full

As storage needs grow by leaps and bounds, enterprises are recognizing the value of open scale-out solutions. The benefits of commodity hardware and OSS once again open the doors to fast innovation with yet another abstracted layer in the data center. Join Dustin Black, Principal Technical Account Manager with Red Hat, for a first-hand account of how Gluster and Ceph are being leveraged in multiple industries to simplify the data center and drive new technologies and architectures. See how top players have re-imagined storage as an agile and nimble service versus the lumbering monolithic backend of the past. In this session, Dustin will provide technical overviews of both Gluster and Ceph. He will then dive deep into use cases, highlighting enterprise challenges and describing how our open storage technologies and a solid support partnership have been leveraged for innovative solutions.

Dustin L. Black, RHCA

dustin

dustin@redhat.com

@dustinlblack

linkedin.com/in/dustinblack

people.redhat.com/dblack

Let’s Talk Distributed Storage

Decentralize and Limit Failure Points
Scale with Commodity Hardware and Familiar Operating Environments
Reduce Dependence on Specialized Technologies and Skills

Gluster

Clustered Scale-out General Purpose Storage Platform
Fundamentally File-Based & POSIX End-to-End
- Familiar Filesystems Underneath (EXT4, XFS, BTRFS)
- Familiar Client Access (NFS, Samba, Fuse)
No Metadata Server
Standards-Based – Clients, Applications, Networks
Modular Architecture for Scale and Functionality

Gluster Stack

gluster stack

Gluster Architecture

gluster architecture

Ceph

Massively scalable, software-defined storage system
Commodity hardware with no single point of failure
Self-healing and Self-managing
- Rack and data center aware
- Automatic distribution of replicas
Block, Object, File
- Data stored on common backend filesystems (EXT4, XFS, etc.)
- Fundamentally distributed as objects via RADOS
- Client access via RBD, RADOS Gateway, and Ceph Filesystem

Ceph Stack

ceph stack

Ceph Architecture

ceph architecture

Gluster Use Case: Client File Services in AWS

Challenge

Isolated customers for web portal SaaS
POSIX filesystem requirement
Efficient elastic provisioning of EBS
- 2PB provisioned; only 20% utilized
Two backup tiers
- Self-service & Disaster recovery
Flexibility to scale up and out on-demand

Implementation

Lorem ipsum

Diagram

aws use case diagram

Gluster Use Case: Media Object Storage

Challenge

Media file storage for customer-facing app
Drop-in replacement for legacy object backend
1PB plus 1TB/day growth rate
Minimal resistance to increasing scale
Multi-protocol capable for future services
Fast transactions for fingerprinting and transcoding

Implementation

12 Dell R710 nodes + MD1000/1200 DAS
- Growth of 6 → 10 → 12 nodes
~1PB in total after RAID 6
GlusterFS Swift interface from OpenStack
Built-in file+object simultaneous access
Multi-GBit network with segregated backend

Diagram

media use case diagram

Gluster Use Case: Self-Service with Chargeback

Challenge

Add file storage provisioning to existing self-service virtualization environment
- Automate the administrative tasks
Multi-tenancy
- Subdivide and limit usage by corporate divisions and departments
- Allow for over-provisioning
- Create a charge-back model
Simple and transparent scaling

Implementation

Dell R510 nodes with local disk
~30TB per node as one XFS filesystem
Bricks are subdirectories of the parent filesystem
- Volumes are therefore naturally over-provisioned
Quotas placed on volumes to limit usage and provide for accounting and charge-back
Only 4 gluster commands needed to allocate and limit a new volume; Easily automated

Diagram

self service use case diagram

Gluster Use Case: NoSQL Backend with SLA-Bound Geo-Replication

Challenge

Replace legacy database key/blob architecture
Divide and conquer
- NoSQL layer for key/pointer
- Scalable storage layer for blob payload
Active/Active sites with 30-minute replication SLA
Performance tuned for small-file WORM patterns

Implementation

HP DL170e nodes with local disk
~4TB per node
Cassandra replicated NoSQL layer for key/pointer
GlusterFS parallel geo-replication for data payload site copy exceeding SLA standards
Worked with Red Hat Engineering to modify application data patterns for better small-file performance

Diagram

nosql use case diagram

Ceph Use Case: Storage & Compute Consolidation for Scientific Research

Challenge

Scale with storage needs
- Eliminate need to move data between backends
- Keep pace with exponential demand
Reduce administrative overhead; Spend more time on the science
Control and predict costs
- Scale on demand
- Simple chargeback model
Efficient resource consumption

Implementation

Dell PowerEdge R720 Servers
OpenStack + Ceph
- HPC and Storage on the same commodity hardware
- Simple scaling, portability, and tracking for chargeback and expansion
400TB virtual storage pool
- Ample unified storage on a flexible platform reduces administrative overhead

Diagram

scientific use case diagram

Ceph Use Case: Multi-Petabyte Restful Object Store

Challenge

Object-based storage for thousands of cloud service customers
Seamlessly serve large media & backup files as well smaller payloads
Quick time-to-market and pain-free scalability
Highly cost-efficient with minimal proprietary reliance
Standards-based for simplified hybrid cloud deployments

Implementation

Modular server-rack-row "pod" system
- 6x Dell PowerEdge R515 servers per rack
- 10x 3TB disks per server; Total 216TB raw per rack
- 10x racks per row; Total 2.1PB raw per row
  - 700TB triple-replicated customer objects
- Leaf-Spine mesh network for scale-out without bottleneck
Ceph with RADOS Gateway
- S3 & Swift access via RESTful APIs
- Tiered storage pools for metadata, objects, and logs
Optimized Chef recipes for fast modular scaling

Questions?

people.redhat.com/dblack

Do it!

Build a test environment in VMs in just minutes!
Get the bits:
- Fedora 23 has GlusterFS and Ceph packages natively
- RHSS 3.1 ISO available on the Red Hat Portal
Go upstream: gluster.org / ceph.com

Thank You!

@RedHatOpen @Gluster @Ceph @dustinlblack #RedHat