Documentum Big Data import/export

14-09-2012
Informed Group notebook


I’ve been away from this blog for a while, busy on projects for clients.
I learned something on one of these projects that I thought was worth sharing.

In a nutshell

Importing BIG files into Documentum, or exporting them from it, is a challenge, but you can get around the out-of-the-box limits.

Here is what happened

I was asked by a client with an existing Documentum system to help them with document import/export. They were unhappy with the solution that the previous contractor had built using TaskSpace and UCF. They complained that imports often failed. They also wanted to add the ability for external systems to automatically import and export documents.
I asked about the kinds of documents they were storing, and these turned out to be somewhat atypical for a Documentum system. In my experience most Documentum systems are filled with documents of kilobytes to megabytes in size, with 1 GB being considered very big. For my customer, most files were between 10 and 50 GB, with some as big as 500 GB. That’s BIG.

Documentum has no problem storing files of that size. The challenge is in getting the files from the client to the server and back.
Since they were asking for import/export functionality for interactive clients as well as back-end integration with other systems, I proposed to create a web service using Documentum DFS (the Documentum web services framework).

Now DFS has several options for content transfer:

  • BASE64: This will include the content as part of the reply message to the webservice client. This is the easiest, but also the most restrictive. Only advisable for very small content files.
  • UCF: This is Documentum’s proprietary content transfer method. It has many cool features for XML files, virtual documents and such, but it had proven unreliable in my customer’s environment with the BIG files they have.
  • MTOM: The Message Transmission Optimization Mechanism is a W3C standard especially meant to reliably send binary data in SOAP webservice calls.
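To make the BASE64 restriction concrete, here is a minimal Python sketch (not DFS code) that measures how much a binary payload grows when it is Base64-encoded for inclusion in a message. The 3 MiB random payload is just a stand-in I chose for illustration.

```python
import base64
import os

# Stand-in binary content; the size is an arbitrary choice for illustration.
payload = os.urandom(3 * 1024 * 1024)

# Base64 represents every 3 input bytes as 4 output characters.
encoded = base64.b64encode(payload)

ratio = len(encoded) / len(payload)
print(f"raw: {len(payload)} bytes, base64: {len(encoded)} bytes, ratio: {ratio:.3f}")
```

The encoded form is about a third larger than the original, and it typically has to be held in the message as a whole, which is one reason inline Base64 only suits small files.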

MTOM looked promising, but I had run into limits using MTOM for big files in a previous project. When exporting several big files simultaneously, the application server running the web services would run into Java memory issues. That previous project had considered 10 MB big, so we were sure to run into the same limits here.

I solved this by cutting the content transfer up in pieces.
Exporting a file now goes like this:

  • The web service client starts an export by specifying which file it wants to receive. The web service returns an export token (a unique ID for this export request).
  • The web service client then calls the web service again, supplying the token and the maximum number of bytes it wishes to receive (the default being 1 MB). The web service returns part of the content file using MTOM.
  • The web service client keeps calling until the full content file is transferred.
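The export steps above can be sketched in Python as a small in-memory simulation. This is not the actual DFS service: the class `ExportService`, the methods `start_export` and `get_next_part`, and the helper `export_to` are hypothetical names, the repository is a plain dictionary, and I assume that an empty part signals the end of the file (the original service may signal completion differently).

```python
import io
import uuid

DEFAULT_CHUNK = 1024 * 1024  # 1 MB default part size, as in the protocol above


class ExportService:
    """Hypothetical stand-in for the export web service (not a DFS API)."""

    def __init__(self, repository):
        self._repository = repository  # object name -> content bytes
        self._streams = {}             # export token -> open content stream

    def start_export(self, object_name):
        # Step 1: register the export and hand back a unique token.
        token = uuid.uuid4().hex
        self._streams[token] = io.BytesIO(self._repository[object_name])
        return token

    def get_next_part(self, token, max_bytes=DEFAULT_CHUNK):
        # Step 2: return at most max_bytes of the remaining content.
        part = self._streams[token].read(max_bytes)
        if not part:
            # Assumption: an empty part means the transfer is complete.
            del self._streams[token]
        return part


def export_to(service, object_name, sink, max_bytes=DEFAULT_CHUNK):
    """Client side: keep calling until the whole file has been written to sink."""
    token = service.start_export(object_name)
    total = 0
    while True:
        part = service.get_next_part(token, max_bytes)
        if not part:
            break
        sink.write(part)  # only one part is held in memory at a time
        total += len(part)
    return total


# Usage: export a 2,560,000-byte stand-in file in 64 KB parts.
content = bytes(range(256)) * 10_000
service = ExportService({"scan.bin": content})
sink = io.BytesIO()
received = export_to(service, "scan.bin", sink, max_bytes=64 * 1024)
print(received, sink.getvalue() == content)
```

The point of the design is that neither side ever needs more than one part in memory per transfer, so memory use depends on the part size rather than on the file size.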

This very simple protocol turned out to work like a charm, even when simultaneously transferring files of many GB. We did advise the client to use a separate DFS server machine, so that the Documentum content server is not congested with all the disk and network traffic the big files cause, and TaskSpace can keep running smoothly for the users.

TaskSpace

For the interactive clients we did one more trick so they can use the new export/import web service.
Normally you would have a component on the TaskSpace application server that acts as a web service client, but then the content would be sent to the application server first, and the application server would send it on to the user’s browser. The big files would travel over the network twice, causing unnecessary delays.
Documentum has a feature called Accelerated Content Services (ACS), but we could not use that in this project. We did find a way to get the content from the DFS server directly to the user’s browser:
We added a little JavaScript to the export page that calls the export web service and combines all the parts of the file into one BIG file. It works, it performs, and it is one solution for both interactive and integration use. I am happy!

Let me know what you think

Sander Hendriks

 

 
