A Quick Start Guide to Installing Ceph

Written by SPK Blog Post

Published on October 28, 2013

Categories: Data Engineering | Infrastructure | Integration and Workflow | Software Development & Release Management

This seems to be the age of “Big Data”. Every sector seems to have a need for it — from biotechs doing genome sequencing, to financial providers mining market data. For many, the ability to store massive amounts of structured or unstructured data is the key to success, and accessing that data quickly is just as important. Traditionally, centralized storage was the go-to solution. You invested in an expensive Storage Area Network, and in return, it provided excellent performance and scalability.

From a small business perspective, a traditional SAN presents several challenges:

Cost – A SAN is typically composed of a storage controller, some disk shelves, and a separate Fibre Channel network. Then there are the not-so-obvious costs, such as SAN software and licensing (management software, replication software) and HBAs.
Administrative overhead – Ethernet switching and routing are ubiquitous. Fibre Channel, on the other hand, requires experience with FC switches, zoning, multipathing, and so on. You would be best served by hiring a dedicated storage administrator.
Scaling with respect to cost – You invest in the SAN equipment, increase your compute capacity, purchase more SAN equipment, rinse, repeat. As you grow your SAN, how do you plan for upgrades? How do you justify the eventual forklift upgrades?

Enter the new era – distributed filesystems. Ok, perhaps this isn’t so new. Google developed its own proprietary in-house filesystem years ago, called BigFiles, which later evolved into the Google File System (GFS). It was designed to run on commodity servers, be resilient (since commodity servers fail), and perform well. No HBAs, no separate Fibre Channel infrastructure, no costly SAN. The idea is that each unit of compute you add brings an additional spindle or two of IO, so aggregate performance grows roughly linearly as you scale.
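To make the scaling idea concrete, here is a minimal back-of-the-envelope sketch. The per-disk IOPS figure and the two-disks-per-node count are assumptions chosen purely for illustration, not measurements, and the function name `aggregate_iops` is hypothetical. The model is deliberately idealized: it ignores replication traffic, network limits, and metadata overhead, all of which reduce real-world throughput.

```python
# Hypothetical figures for illustration only; these are not measurements.
IOPS_PER_SPINDLE = 150   # assumed random IOPS for one commodity disk
SPINDLES_PER_NODE = 2    # "an additional spindle or two" per compute node


def aggregate_iops(nodes: int,
                   spindles_per_node: int = SPINDLES_PER_NODE,
                   iops_per_spindle: int = IOPS_PER_SPINDLE) -> int:
    """Idealized raw IOPS of a cluster, ignoring replication and network overhead."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes * spindles_per_node * iops_per_spindle


if __name__ == "__main__":
    # Doubling the node count doubles the idealized aggregate IOPS.
    for n in (1, 2, 4, 8):
        print(f"{n} nodes -> {aggregate_iops(n)} IOPS (idealized)")
```

Under these assumptions, every node you add contributes the same increment of IO capacity, which is the sense in which performance is "linear" here. A real cluster will fall short of this line.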

Several open source distributed filesystems have gained popularity recently. The one I’ll be discussing today is Ceph. Among other uses, it can serve as a replacement for Hadoop’s distributed filesystem (HDFS), and I’ll show you how you can quickly deploy it to serve as your primary storage. Even if you’re not sequencing any genomes, you likely have a VMware cluster, an Exchange installation, or users who simply like to store lots of files. Any of these situations can reap the benefits.

Click here to download our Quick Start Guide to Installing Ceph.
