1.8 Million IOPS.
That seems like a number reserved for high-end proprietary storage systems,
out of reach for all but the wealthiest of IT budgets.
Achieving that level of performance using commercially available
off-the-shelf components is, in a word, incredible!
It took the vision, experience, and cooperation of
industry leaders coming together to design a system
that not only achieved the desired results but far exceeded them.
This project began with Orange Silicon Valley,
a wholly owned innovation and
research subsidiary of the telecom giant Orange.
At Orange Silicon Valley we are always striving to bring
disruptive innovation that can address the IT needs
of Orange and our clients.
We are always interested in finding ways of doing
more with less, and maximizing asset utilization.
We believe that, based on open standards and open
architecture, it is possible to deliver
high-end, appliance-type performance
at a much lower TCO (total cost of ownership).
As the world races towards exascale,
we envision that "extreme compute" can become
more affordable for enterprise IT.
The challenge this time: design a high-performance,
linearly scalable, appliance-like system
that can handle an intensive, I/O-bound
online transaction processing workload.
The system would be considered Mission Critical++:
live, customer-facing,
with a zero-downtime SLA (service level agreement).
Easy enough.
There is only one additional request.
Do it using commercially available
off-the-shelf components AND
be able to demonstrate a significant
reduction in total cost of ownership.
The Orange team reached out to their
trusted technology advisors at Hyve Solutions to help them
identify the right technical direction for the project.
Hyve Solutions, a division of SYNNEX Corporation,
is a leader in purpose-built data center server
and storage solutions, designed to meet
their customers' specific workload requirements.
Orange Silicon Valley came to us with a
clear but challenging set of requirements.
Orange had thought through their technical and business requirements in great detail.
My engineering team went to work and got creative
so we could provide a higher level of collaboration
and flexibility to solve their tough technical problems and exceed their goals.
The teams worked together to develop a plan
for the type of system and performance
they were looking to achieve.
The next step was finding the right mix of off-the-shelf components
to get the proof-of-concept system built.
This wasn't an easy task.
There were many important decisions to make.
Picking the right partners for us was
as important as picking the right components.
We needed partners that were at the forefront of their respective technologies,
but that also had the resources to support us through both the evaluation and production phases.
The team chose a Sandy Bridge based platform
with six PCI Express 3.0 slots for its
exceptional balance of high data rates and clock speeds.
The system supports a quad-channel
memory architecture, as well as high-speed DRAM
at 1,600 megatransfers per second.
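For context, quad-channel DDR3 at that speed offers a theoretical 51.2 GB/s of memory bandwidth per socket: 1,600 MT/s × 8 bytes per transfer × 4 channels.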
For the RAID controller, the choice was made to go with LSI.
After a series of discussions with LSI,
we came to the conclusion that they were the logical choice
for our storage controller needs.
The range of the LSI product portfolio,
the field-tested reliability and maturity of their designs,
and their organizational depth
made them the odds-on favorite.
LSI strives to deliver industry-leading storage technologies
that accelerate applications and improve the end-user experience.
We are proud to collaborate with Hyve and Kingston
in order to help Orange Silicon Valley deliver against a very aggressive goal.
Using our years of technology leadership
and storage expertise to help the team reach
a milestone of over 1.8 million I/Os per second
is a powerful achievement.
We believe it is consistent with our efforts
to improve the end user's overall compute experience
and help Orange deliver more information to more users faster.
The last two components the team needed to source
were memory and SSDs (solid state drives).
We needed a large memory and solid-state drive footprint
for this design. And with Kingston's solid reputation
for reliability in the enterprise space,
they were an easy contender.
On the SSD side, their E100 showed promising performance,
and they offered great engineering support.
After initial testing their drives exceeded expectations,
and we moved forward with Kingston as one of our partners.
When the Orange/Hyve team approached us to participate
in this project we immediately recognized that this
was an opportunity to be a part of an incredible milestone
in our industry. Our job was twofold:
On the DRAM side, to make sure the memory configuration
was optimized for the best performance.
And on the flash side, to select the right class of SSD,
one that delivered not only exceptional performance
but also met the exacting endurance requirements
of the intended workload.
Now that all the partners were in place, the team began testing
the system against the goal set for the project.
With the system configuration locked down, each partner set up
the hardware in their lab environments to run the benchmarks simultaneously.
Once we had the hardware set up in our labs we began tuning
the environment for optimal performance.
We used the fio benchmark under CentOS 6.3 to benchmark the
24-drive subsystem.
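To make that concrete, a fio job along the following lines exercises this kind of configuration. The device names, queue depth, and job count here are illustrative placeholders, not the exact parameters from our runs; the three virtual drives it targets are described next.

    [global]
    # asynchronous, unbuffered 4K random reads: a common stand-in for OLTP I/O
    ioengine=libaio
    direct=1
    bs=4k
    rw=randread
    # queue depth and worker count chosen to keep the SSD queues full
    iodepth=32
    numjobs=4
    runtime=60
    time_based
    group_reporting

    # one job per RAID 0 virtual drive (/dev/sdb through /dev/sdd are placeholders)
    [vd0]
    filename=/dev/sdb
    [vd1]
    filename=/dev/sdc
    [vd2]
    filename=/dev/sdd

Running fio against a job file like this reports the aggregate IOPS across all three virtual drives; swapping in rw=randrw with an rwmixread percentage gets closer to a true OLTP read/write blend.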
We attached the 24 SSDs to three LSI MegaRAID 9265-8i RAID controllers,
with an eight-drive RAID 0 configuration on each controller.
This allowed us to take advantage of the aggregate performance
that can be achieved by distributing the workload
across the PCI Express channels.
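For reference, building one of those eight-drive RAID 0 virtual drives with LSI's MegaCli utility looks roughly like this; the enclosure and slot numbers are placeholders for the actual drive topology, and the exact syntax varies slightly between MegaCli versions:

    # one eight-drive RAID 0 virtual drive on controller 0
    # (enclosure 252, slots 0-7 stand in for the real drive addresses)
    MegaCli -CfgLdAdd -r0[252:0,252:1,252:2,252:3,252:4,252:5,252:6,252:7] WT NORA Direct -a0
    # repeat with -a1 and -a2 for the other two controllers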
To further improve performance we used LSI's FastPath performance option,
which unlocks additional IOPS by changing the characteristics
of the firmware to optimize for SSDs.
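FastPath is typically engaged on virtual drives configured for write-through caching, no read-ahead, and direct I/O, which is why the WT NORA Direct settings appear in the sketch above. With MegaCli, those properties can also be applied to existing virtual drives, roughly as follows (check the syntax against the documentation for your controller firmware):

    # write-through cache, no read-ahead, direct I/O on every virtual drive
    MegaCli -LDSetProp WT -LAll -aAll
    MegaCli -LDSetProp NORA -LAll -aAll
    MegaCli -LDSetProp Direct -LAll -aAll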
Until recently, one of the roadblocks to higher storage performance
with SSDs was that RAID controllers were engineered for
mechanical hard drives. SSDs are capable of such high performance
that their real potential was being held back.
We now have RAID controllers available to us that
are specifically designed for SSDs.
As this project demonstrates, we are now able to
scale SSD performance to levels that weren’t possible
as recently as a year ago.
The initial results were very promising.
We measured close to our goal of one million IOPS on our first test run,
and we tuned the system until it consistently delivered close to 1.8 million IOPS.
We started synthetic I/O benchmarks in early 2012 that emulated
real-world OLTP behavior, and we crossed the million-IOPS scalability barrier,
a moment reminiscent of the sonic boom associated with achieving Mach 1.
With a 24-drive bay fully populated with Kingston drives
in RAID 0, powered by three LSI cards, we exceeded 1.8 million IOPS.
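Back of the envelope, 1.8 million IOPS across three controllers works out to roughly 600,000 IOPS per controller and about 75,000 IOPS per drive across the 24 SSDs, the kind of even split you would expect from near-linear scaling across the PCI Express channels.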
We are very close to hitting a figurative Mach 2.
We are working on using this platform for OLTP use cases
that resemble our mission-critical, extremely I/O-demanding applications.
We might need to pack a few more teraflops into the box
to be able to fully utilize the near 2 million IOPS potential of the solution.
We’ll find out if this is the case as we keep making progress, so stay tuned!
In the end, not only was the project goal achieved,
but it exceeded all the partners’ expectations.
Our vision was to design a high-performance,
linearly scalable system using COTS components, with a significant reduction in TCO.
Our design targets mission-critical, live,
customer-facing systems with zero-downtime SLAs
and intensive, I/O-bound online transaction processing workloads.
With our colleagues at Orange Silicon Valley, we were able to
work with design partners towards the goal of
achieving extreme compute at commodity cost.
This is very exciting for us!
Now we have made this an open architecture that any IT organization
can build for their extreme I/O needs.
For our database consolidation platforms we expect
Carrier Grade performance and reliability.
Achieving that with a significantly lower TCO becomes a key game changer for IT.
With this proof of concept and its ability to deliver performance,
reliability and scale for high I/O bound environments,
and all at a reasonable cost, the only question that remains is:
How can you use the 1.8 Million IOPS to overcome your I/O challenges?
“It’s kind of fun to do the impossible” - Walt Disney