In my last ECM post, I provided an introduction to the discipline of Enterprise Content Management. Many developers confuse Web Content Management (WCM) and Enterprise Content Management (ECM). When they are told about a content repository, they think of the tool they use to build web sites instead of the tool they use to manage the general content within their organization. This post explains the difference between these problem domains, and provides approaches for getting your enterprise content onto the web. As I explain in that previous post, this is a specific case of the general trade-off between building a solution around a general ECM repository or deploying a content tool specially designed for the problems you are facing.
This table summarizes the major differences:
| |ECM|WCM|
|---|---|---|
|Content Types|All|HTML, XML, and media|
|Custom Content Models Needed|Usually|Not usually|
|Authoring Environment|Integration with tools of choice|Done as text in the browser|
|Web Presentation Tier|Integration through services|Built-in|
|Workflow|Customizable|Basic review and approve|
|Versioning|All content types|Text content types|
|Network Architecture|Behind the firewall|DMZ or public Internet|
ECM systems are generalists—they are prepared to handle any type of content and can be customized for any situation. As always, this comes with a trade-off; when building a business solution, ECM systems need to be configured to handle the specifics of the content types involved in that scenario. This process commonly involves content modeling, which is defining the metadata and forms needed to classify the different types of content in the system. It also frequently involves setting up transformations, workflow, and user interfaces optimized for the specific content types. It will probably be necessary to integrate with other tools for creating, transforming, or publishing the content. An ECM repository does not dictate what content or tools should be used, and the result of this work is a system customized to your specific business.
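To make the idea of content modeling concrete, here is a minimal sketch in Python. It is not how any particular ECM product defines its models; the `PropertyDef` and `ContentType` classes and the whitepaper property names are hypothetical, chosen only to show a content type declaring the metadata it expects and checking a document against it.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PropertyDef:
    """One metadata field on a content type (hypothetical structure)."""
    name: str
    datatype: type
    mandatory: bool = False


@dataclass(frozen=True)
class ContentType:
    """A named content type together with the metadata it declares."""
    name: str
    properties: tuple[PropertyDef, ...]

    def validate(self, metadata: dict) -> list[str]:
        """Return a list of problems; an empty list means the metadata fits."""
        problems = []
        for prop in self.properties:
            if prop.name not in metadata:
                if prop.mandatory:
                    problems.append(f"missing mandatory property: {prop.name}")
                continue
            if not isinstance(metadata[prop.name], prop.datatype):
                problems.append(f"{prop.name} should be {prop.datatype.__name__}")
        return problems


# Illustrative type for a marketing whitepaper; the property names are made up.
WHITEPAPER = ContentType(
    name="whitepaper",
    properties=(
        PropertyDef("title", str, mandatory=True),
        PropertyDef("product_line", str, mandatory=True),
        PropertyDef("publish_date", date),
    ),
)

# A document missing a mandatory field is reported rather than silently stored.
print(WHITEPAPER.validate({"title": "ECM vs. WCM"}))
```

A real repository would also attach forms, transformations, and workflow to each type, but the core of the modeling work is this kind of declaration.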
A repository focused on a specific problem domain, such as WCM, will already have most of the capabilities necessary for meeting the needs of that domain. Web Content Management is a special case of ECM where you know that the majority of the content will be text-based and highly structured as XML or HTML. The rest of the content will probably be media such as images, audio, and video that have already been prepared for publication. WCM systems can safely assume that the content will be organized as web pages, and that those pages will likely need a review-and-approve process. WCM systems deliver content at Internet scale, so they are optimized for serving content, not for authoring it. They usually assume that content authors are comfortable working in a web browser and understand the basic architecture of the web.
WCM only requires a subset of the content services provided by a full ECM system, but it has special capabilities in key areas such as a scalable web interface. A WCM system will be quicker to set up than an ECM system. But a WCM system will struggle with tasks such as versioning PDF whitepapers while they are being written, archiving legal records, or allowing the marketing department to design product catalogs using Adobe products. These are cases that require communication between what is called the system of record (ECM) and the system of engagement (WCM). Also, content producers benefit from the version history, workflow, and metadata a repository provides, but they do not want to spend all day in a web browser text box. They have better tools, such as an office suite, InDesign, or Photoshop. Trying to force them to work in the WCM tool often results in ineffective and frustrated content creators.
Building an ECM + WCM Solution
If your web site deals with complex content types, or interacts with more of the business than the web team, then you need a solution for getting your enterprise content onto the web. There are three general approaches for architecting these solutions:
1. Adapt your WCM system to meet the ECM use case. This is the most commonly attempted and the least successful approach. WCM systems are not meant to meet ECM needs, and the adaptation usually results in poor solutions and broken WCM systems. If you have well-defined and limited ECM needs, this can be successful. But usually a request to add an ECM capability to a WCM system is the start of a series of increasingly complex requests.
2. Adapt your ECM system to meet the WCM use case. This is the least common approach but can be very successful. However, it can also be expensive and time-consuming, as all the desired WCM features need to be added to the ECM repository. Most ECM vendors have solutions that make this easier or have partners who can get this done. Projects to adapt an ECM repository for a WCM use case need to be managed carefully, as mismatched expectations between the solution provider and the non-technical content creators are very common and lead to intense feature creep.
3. Integrate between the ECM system and the WCM system. There are two general approaches to an integration: publish from the ECM system to the WCM system, or query from the WCM system into the ECM system. Many solutions will use a combination of the two approaches. An integration can often be the quickest and best solution as it combines two best-of-breed platforms, but it requires understanding both platforms and can be harder to maintain. Functionality such as an end-to-end workflow can be hard to implement across two different systems.
Let's look at options 2 and 3 in more detail. I work for Alfresco. Since it is open source, it is easy for you to play with to get a sense of what I am describing. Alfresco is an ECM system, not a WCM system. Instead of focusing on the web presentation tier, Alfresco provides a set of web content services to make it easy to get content from the authoring tier to the presentation tier. These services include:
- Publication of content to the filesystem, where it can be picked up by a web interface as a static file.
- Publication of content through social channels defined in XML.
- Content access through file server views such as CIFS, NFS, FTP and WebDAV.
- Manipulation and query of content with CMIS (Content Management Interoperability Services).
- Manipulation and query of content through the Alfresco REST API.
- Manipulation and query of content through custom REST API endpoints using webscripts.
- A sample Java Spring WCM interface called Alfresco's Web Quick Start.
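As an illustration of the CMIS option in the list above, the following Python sketch builds a CMIS query statement that a presentation tier might send to the repository. It only constructs the query text; how the statement is submitted (client library, binding, endpoint URL) depends on your installation and is left out. The function names, the folder ID, and the choice of filtering on MIME type are assumptions made for the example, and whether a given property is queryable or orderable depends on the repository.

```python
def cmis_quote(value: str) -> str:
    """Quote a string literal for a CMIS query.

    CMIS query strings use a backslash to escape single quotes and
    backslashes inside quoted literals.
    """
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_asset_query(folder_id: str, mime_type: str) -> str:
    """Build a query for documents of one MIME type anywhere below a folder.

    folder_id is the CMIS object ID of the folder (hypothetical value below).
    """
    return (
        "SELECT cmis:objectId, cmis:name, cmis:lastModificationDate "
        "FROM cmis:document "
        f"WHERE IN_TREE({cmis_quote(folder_id)}) "
        f"AND cmis:contentStreamMimeType = {cmis_quote(mime_type)} "
        "ORDER BY cmis:lastModificationDate DESC"
    )


# Example: find the PDFs under a (made-up) "published whitepapers" folder.
print(build_asset_query("folder-123", "application/pdf"))
```

Because CMIS is a vendor-neutral standard, a statement like this should be portable to other CMIS-compliant repositories, which is one reason to prefer it over a product-specific API for simple listing queries.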
If you decide to build your own WCM interface on top of the Alfresco platform (option 2), you can start with Web Quick Start and customize it to meet your needs. Web Quick Start has a functional but bare-bones WCM interface which serves as an example for querying the APIs and displaying the content, as well as editing content in-context in the presentation tier. In order to use it as the basis of a solution, you would customize the Web Quick Start presentation tier and implement additional needed functionality with Java Spring. Another approach is to use a WCM presentation tier provided by an Alfresco partner such as Crafter Software. This provides all the features you would expect of a WCM solution, while also having the flexibility of an ECM repository for complex workflows and diverse content types.
If you have an existing WCM system and you want to add ECM capabilities to it, or if you want to leverage a best-of-breed WCM presentation tier, then you need to integrate the two systems (option 3). Some object to this approach because it requires the organization to master two systems, but I don't find that to be a problem because there is a clear separation of responsibilities.
The web team will use the WCM tool to handle the site templates and manage the structured HTML content. Most of that content does not need to be kept in the ECM system of record, and the rest of the organization should not have access to modify it. Static assets can be prepared with the help of the ECM tool and published to the static asset file location. On the other hand, the team that authors the media and PDFs will use the ECM system to do their work, and upon completing the publication workflow those assets will either be published or be marked available for the WCM system to query. The presentation tier will display the content in the appropriate location based on the design of the site. If the organization has a requirement for keeping a history of public pages, that can be set up as a scheduled job for the ECM system to snapshot public pages and store them. This is more robust than trying to reassemble dynamically generated pages as tools change over time.
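The difference between publishing and querying can be sketched in a few lines of Python. This is a toy, in-memory stand-in and not the Alfresco API: the `Asset` and `EcmRepository` classes, the `approved` flag, and the file names are all hypothetical. The point is only to contrast the push model (the ECM side writes approved assets to the static file location) with the pull model (the WCM side asks which assets are available).

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Asset:
    name: str
    content: bytes
    approved: bool = False  # set when the publication workflow completes


class EcmRepository:
    """Stand-in for the ECM system of record (hypothetical, in-memory)."""

    def __init__(self):
        self._assets = {}

    def add(self, asset: Asset) -> None:
        self._assets[asset.name] = asset

    def approve(self, name: str) -> None:
        self._assets[name].approved = True

    def publish_approved(self, static_dir: Path) -> list:
        """Push model: write approved assets where the web tier serves static files."""
        static_dir.mkdir(parents=True, exist_ok=True)
        written = []
        for asset in self._assets.values():
            if asset.approved:
                (static_dir / asset.name).write_bytes(asset.content)
                written.append(asset.name)
        return sorted(written)

    def query_available(self) -> list:
        """Pull model: the WCM tier asks which assets it may display."""
        available = [a for a in self._assets.values() if a.approved]
        return sorted(available, key=lambda a: a.name)


repo = EcmRepository()
repo.add(Asset("whitepaper.pdf", b"%PDF-1.4 ..."))
repo.add(Asset("draft-catalog.pdf", b"%PDF-1.4 ..."))
repo.approve("whitepaper.pdf")  # only this asset finished its workflow

with tempfile.TemporaryDirectory() as tmp:
    print(repo.publish_approved(Path(tmp) / "static"))  # push
print([a.name for a in repo.query_available()])         # pull
```

In both cases the unapproved draft never reaches the web tier, which is the separation of responsibilities described above: authors work in the system of record, and only content that has cleared the workflow crosses over.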
Our web team used this approach on alfresco.com. It can be done from any front-end that has an Alfresco integration or supports CMIS. Examples I have seen include Liferay, Drupal, Joomla, and Django. Of course the Alfresco content services can be used to build your own integration with any other system you would like.
Hopefully the next time you are asked to get your enterprise content onto the web, this background will guide your thinking. The same principles can be applied to any decision between a solution-specific content repository and a general enterprise content management repository. They each have their place, and it takes careful consideration to select the best approach for your particular situation.