During the 1990s, it was widely recognized that a higher-level structure for building business applications was needed. At that time, any significant development project was typically built from the “metal up,” using low-level languages such as C++ and Java. The project usually also had to define its own runtime environment and build its own services. The one exception to this was an RDBMS. Most frameworks of the time, such as J2EE or .NET, provided some packaging of services, but were so complex and slow that they didn’t provide a clear benefit.

The fundamental problem was that for many business cases, the amount of code to host the application dwarfed the application code itself. Clearly, there was a critical need for a standard platform that would provide the services in a nicely packaged environment, complete with documentation and development tools. A number of entrants tried to succeed in this space, but most of them faded away. Two notable successes still around today are Salesforce.com and Ruby on Rails.

Today, with the growth of IoT data and artificial intelligence-based analytics, the focus in many businesses is on more complex data processing and on “big data” volumes and workflows. To address this new set of requirements, we must implement new techniques for data acquisition, processing, storage, and analysis. Naturally, there is a new batch of so-called platform contenders in the market attempting to meet this need. However, in their attempt to focus on “big data,” many of these platforms have forgotten the lessons of the 1990s.

There is more to big data than just being able to store it. One must be able to extract value from data in natural and efficient ways, and it should be easy to express business logic on the data. These requirements were adequately addressed in the past for traditional RDBMS cases, but too many providers are starting over again with big-data processing techniques that require systems-level or structured programming for what should be simple logic.
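To make that contrast concrete, here is a minimal sketch in Python (the names and data are invented for illustration; no vendor’s API is shown): the same per-sensor average written first as the hand-rolled map/reduce plumbing a developer ends up writing for each new piece of logic, then as the short declarative form a good platform should make possible.

```python
from collections import defaultdict
from statistics import mean

readings = [
    ("sensor-a", 10.0), ("sensor-b", 4.0),
    ("sensor-a", 14.0), ("sensor-b", 6.0),
]

# Low-level style: explicit map and reduce phases.
def map_phase(records):
    for key, value in records:
        yield key, (value, 1)

def reduce_phase(mapped):
    acc = defaultdict(lambda: (0.0, 0))
    for key, (value, count) in mapped:
        s, c = acc[key]
        acc[key] = (s + value, c + count)
    return {k: s / c for k, (s, c) in acc.items()}

low_level = reduce_phase(map_phase(readings))

# High-level style: the same business logic expressed directly.
grouped = defaultdict(list)
for key, value in readings:
    grouped[key].append(value)
high_level = {k: mean(v) for k, v in grouped.items()}

assert low_level == high_level  # both: {'sensor-a': 12.0, 'sensor-b': 5.0}
```

The point is not that the low-level version is impossible to write, but that every new piece of business logic pays the same plumbing tax.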

A comprehensive set of platform requirements for today’s applications

As we’ve learned from the successful platforms of the 1990s, the requirements of a robust platform are:

  • A uniform language and consistent APIs
  • An integrated object mapping (ORM) layer
  • An effective tool set for defining data and building user interfaces (UIs)
  • Natural expression of business logic
  • Easy testing and deployment of code
  • Complete and up-to-date documentation

Today, for businesses with modern data requirements, we must build on those successful advancements to add:

  • Heterogeneous data storage
  • Modern data processing and analysis tools
  • Large-scale horizontal scaling
  • Effective logging and management
  • Data aggregation from enterprise and extraprise/sensor systems
  • A common platform interchange model for developers and data scientists
  • Access control and security

These new requirements are in addition to, not instead of, the existing traditional platform requirements. Offerings that focus only on “big data,” and require developers to work with a mishmash of languages and technologies, or without consistently good documentation or comprehensive tools, do not meet the true need and are destined to fade away just like most of the platform offerings of the 1990s.

Let’s take a look at some of the existing attempts to address next-generation platform requirements, and see how they stack up.

Salesforce.com: Primarily RDBMS support

Salesforce.com is probably the most effective and widely used true platform for simple business applications. It provides all of the traditional requirements of a platform, including documentation and tools. Salesforce.com has been very successful at accommodating applications related to, but also beyond, its traditional salesforce automation (SFA) core.

However, the Salesforce platform is a good example of what was done well in the past, not what is required today. Major limitations include its narrow range of user-interface options and its support for storage only in relational databases. The old-fashioned single-database model and the lack of non-trivial data analysis make it unsuitable for today’s big-data, AI, and IoT needs.

Ruby on Rails: Insufficient for big data

The Ruby language and the Rails framework built on it grew explosively after Rails was released in 2004, because the two together provide many aspects of a true platform. Ruby on Rails uses a single, high-level programming language and comes with a database configured out of the box. It is easy to deploy, and while the documentation and third-party packages (“gems”) are uneven, the large community of developers provides ready and easily available answers to most common questions.

Unfortunately, the Ruby on Rails platform was never intended for high-volume workloads and historically has not scaled well. It has also not been used much for big-data processing flows, where different techniques are needed. The Ruby language is probably capable of being used for these tasks, but the platform has not grown to support these areas.

J2EE and .NET: No modern tools

The two most ambitious efforts in the previous generation of platform technologies were the Java Platform, Enterprise Edition (J2EE) application servers and Microsoft’s .NET.

Java EE (formerly called J2EE) is still used in some legacy systems, but is no longer being widely adopted. It collapsed under its own weight because the services it provided never justified its complexity. It also suffered from the traditional open-source woes of inconsistent documentation and fragmentary tools.

The .NET platform has had more success because of its high-quality documentation and tools. Microsoft has always been effective at supporting its developers and brought that focus to .NET as well. However, .NET doesn’t really provide any modern facilities and remains tied to Windows Server, which has not developed into an effective server environment.

Hadoop: A collection of un-integrated tools

Hadoop was originally a file system and batch-processing technology (map/reduce on HDFS), but has evolved into a collection of open-source projects focused on big-data storage and processing workflows. Beyond map/reduce, the ecosystem around Hadoop has added tools such as Apache Storm and Apache Cassandra: Storm provides stream-processing capabilities, while Cassandra joins HDFS in offering more data-storage options.

These projects are all excellent pieces of software and quite effective. Unfortunately, a platform that is just a collection of disparate technology projects is no more than the sum of its parts. For organizations building next-generation business applications, this means slow and complicated development. Again, the cost to stitch these disparate tools together to then host an application dwarfs the application itself.

C3.ai: A complete platform for Big Data, AI, and IoT applications at scale

We recognize that a useful platform for today’s business applications must meet all the traditional platform requirements, including APIs, tools, and documentation, as well as the new big-data and complex-analysis requirements. This is what we have created and proven with the C3 AI Platform, which runs in production environments at many customers.

The C3 AI Platform is built on modern technologies (including many of those now under the Hadoop umbrella), but provides an actual platform on top of those technologies. This is a vital distinction.

For example, multiple database technologies are necessary for industry applications:

  • An RDBMS for structured data with complex query requirements
  • NoSQL for data with vast volume requirements
  • A column-oriented database for data warehouse needs
  • A distributed file system

A true platform provides all these facilities, but in a unified model. The aspects of this unification include:

  • A single, consistent set of APIs
  • An ORM layer that works across and between multiple storage technologies
  • Consistent and high-quality documentation
  • All processing techniques work across all data
  • Tools that allow building UIs quickly
  • Integrated logging and monitoring across all pieces of infrastructure

Crucially, C3.ai has focused on AI for complex analysis. Not only should it be possible to store the massive amounts of data required for that analysis; one should also be able to visualize it easily. Even more important, it should be easy to write business logic that runs efficiently on data as it arrives, rather than requiring a batch programming script for each piece of logic. The C3 AI Platform provides all these capabilities. A typical application on the C3 AI Platform requires roughly a tenth of the code of a conventional implementation.
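The difference between logic that runs on data as it arrives and a batch script can be sketched as follows (an invented example, not platform code): an incremental aggregate is updated per arrival, and its final value agrees with what a batch recomputation over the full history would produce.

```python
class RunningMean:
    """Incrementally updated aggregate: processes each reading on arrival."""
    def __init__(self):
        self.count = 0
        self.total = 0.0

    def observe(self, value: float) -> float:
        self.count += 1
        self.total += value
        return self.total / self.count  # mean as of this arrival

stream = [3.0, 5.0, 4.0, 8.0]
agg = RunningMean()
live_means = [agg.observe(v) for v in stream]  # [3.0, 4.0, 4.0, 5.0]

# A batch script would instead recompute over the full history each run.
batch_mean = sum(stream) / len(stream)

assert live_means[-1] == batch_mean == 5.0
```

The incremental version answers at every arrival for constant work per event, while the batch version must rescan everything to answer once.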

When you consider platforms on which to build modern business applications, particularly if you need to support big data or modern data-analysis tools, make sure you choose a platform that provides the right capabilities. Over the last two decades we have learned which features are necessary. A collection of open-source projects is not a true platform, but only a set of technologies that developers must integrate themselves. A true platform provides all of the needed capabilities as a unified, integrated whole. Don’t settle for something merely new; make sure your chosen platform provides everything you need and helps your developers be productive and your applications be effective.

John Coker is a Senior Architect on the Platform team at C3.ai. He attended the University of California, Berkeley, and is enjoying working on his fourth start-up. “I keep finding myself involved in solving hard problems, but they’re the ones that are more interesting and provide real value.”