Data Storage Digest


LinkedIn Using Pinot to Store, Process and Analyze Data

LinkedIn is a fast-growing web portal that gives professionals from every industry and all around the globe the chance to interact with one another, form business networks and even cultivate new industry partnerships that they might not otherwise have access to. With this in mind, it's easy to see why the data storage and processing demands on LinkedIn's servers are so high. To get a better handle on the amount of data processing needed to sustain operations in the long term, the development team at LinkedIn has recently developed Pinot, a brand new, real-time data analytics system designed to provide an architecture that can support the entirety of LinkedIn's data storage, processing and analytics needs.

According to a statement released via LinkedIn's official blog, their service has "a lot of depth and each dimension requires special treatment." The blog post continues: "We needed to build custom compression techniques to fit every dimension, in order to get optimal scan speed tradeoff versus memory consumed. For example, each one of our members can have hundreds of skills and representing them per event is difficult. Similarly, groups that members belong to and companies they follow are some of the dimensions difficult to represent per event. We built Pinot with this difficult to index data in mind, but will save the details of the compression techniques for future posts."
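LinkedIn has not yet described its compression techniques, so the following is only a generic illustration of why a multi-valued dimension such as skills is costly to store per event. It is a minimal sketch of plain dictionary encoding, a common textbook technique and not necessarily what Pinot does. The function names and the sample events are hypothetical.

```python
from typing import Dict, List, Tuple


def dictionary_encode(rows: List[List[str]]) -> Tuple[List[str], List[List[int]]]:
    """Replace every string value with a small integer ID.

    Returns the dictionary (position = ID, entry = original value)
    and the rows rewritten as lists of IDs.
    """
    ids: Dict[str, int] = {}
    dictionary: List[str] = []
    encoded: List[List[int]] = []
    for values in rows:
        encoded_row = []
        for value in values:
            if value not in ids:
                # First time this value is seen: assign the next free ID.
                ids[value] = len(dictionary)
                dictionary.append(value)
            encoded_row.append(ids[value])
        encoded.append(encoded_row)
    return dictionary, encoded


def decode(dictionary: List[str], encoded: List[List[int]]) -> List[List[str]]:
    """Rebuild the original rows from the dictionary and the ID lists."""
    return [[dictionary[i] for i in row] for row in encoded]


# Hypothetical events: each one carries the skills of the member involved.
events = [
    ["python", "hadoop", "sql"],
    ["sql", "python"],
    ["java", "sql", "hadoop"],
]

dictionary, encoded = dictionary_encode(events)
print(dictionary)
print(encoded)
```

Each distinct skill is stored once and every event holds only short integer IDs. With hundreds of skills per member, the per-event lists still grow long, which hints at why the blog post calls such dimensions difficult to represent.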

One of the primary goals of Pinot is to change the way LinkedIn's servers handle data processing. Under the new system, LinkedIn can more easily introduce system improvements, add brand new site features and scale its services as demand grows. Pinot is also tasked with handling the current functionality of LinkedIn, including certain analytical features that users have become accustomed to, so those who already rely on LinkedIn to manage their own analytics will not even notice the change.

Praveen Neppalli Naga, who led the team of engineers that built a consolidated data system for LinkedIn, commented on the difficulty of creating such a system. He was quoted as saying: "There was not one proper solution that would be leverageable across the company."

This kind of centralized engineering team was new territory for LinkedIn, which used to rely on departmental groups and smaller, separate data storage and processing teams. While this was fine during the site's infancy, there was no choice but to expand as LinkedIn's popularity grew.

LinkedIn was officially founded in December of 2002, though it didn't see its public launch until May of 2003. Since then, the company has helped more than 300 million users pursue new job opportunities, meet new business partners and solidify industry contracts. Led by CEO Jeff Weiner and founded by a small group of individuals, including Reid Hoffman, Konstantin Guericke, Jean-Luc Vaillant, Allen Blue and Eric Ly, LinkedIn is based in the United States with its headquarters in Mountain View, CA.
