Question

Bulk Load 100,000 Legacy Documents

I need to load 100,000+ documents from a legacy system into an existing Mendix app. The documents are currently in a directory on a server. My current thought is to:

– create an account with a service like www.files.com or www.dropbox.com, which offers storage and an API around its storage offering
– create a microflow in Mendix that retrieves a batch of files (likely via REST calls), processes them, and loads them into Mendix
– run that microflow in a scheduled event until all the files are loaded

However, I am guessing I’m not the first person to have this kind of requirement, and I am curious how others have done this kind of thing before. Any and all ideas and feedback welcome!

This app is hosted in the Mendix cloud and is currently on version 7.14.1.

asked 2019-08-03

Mike Kumpf

3 answers

Nikel Kruizinga · Answer 1 · 2019-08-05

It depends on what your requirements are for the migration process. Does this need to happen all at once? For example, the users still have access to the files in the old system on Friday, but need to be able to access all of them in the Mendix application on Monday? Or is it acceptable to migrate over the period of a month?

If speed is a requirement, the best option would be to migrate the files locally on a very fast machine. What would be both quick and relatively safe is the following approach:

– Create a CSV with all the metadata for the files.
– Download database and files backup from the cloud
– Run a custom import in your Mendix application for the CSV. Have this create dummy files with size 0.
– Extract the list of generated files from the Mendix database so you know the filenames
– Run a batch script to replace the dummy files with the actual files
– Run an SQL script to update the file sizes for the files in the Mendix database.
– Upload database and files to the cloud
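The first step above (building the metadata CSV) could be sketched roughly as follows. This is only an illustration: the column names and the function `build_metadata_csv` are made up for the example, and the columns you actually need depend on what your custom import expects.

```python
import csv
import os
from datetime import datetime, timezone


def build_metadata_csv(source_dir, csv_path):
    """Walk source_dir and write one CSV row per file; return the row count.

    Write csv_path outside source_dir, otherwise the CSV lists itself.
    """
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Hypothetical column names; adapt them to your import mapping.
        writer.writerow(["relative_path", "file_name", "size_bytes", "modified_utc"])
        for root, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                stat = os.stat(full_path)
                modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
                writer.writerow([
                    os.path.relpath(full_path, source_dir),
                    name,
                    stat.st_size,
                    modified.isoformat(),
                ])
                count += 1
    return count
```

Keeping the real size in the CSV is handy later, because the dummy files are created with size 0 and the sizes have to be corrected afterwards.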

This approach lets Mendix create all database records, so you can be sure you don't mess anything up with regard to generated IDs and the use of database sequences.
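The "replace the dummy files" step could look something like the sketch below. It assumes you have already produced a mapping CSV with two hypothetical columns, `legacy_path` (relative to the legacy directory) and `stored_path` (the dummy file's path relative to the files directory of the downloaded backup). How you derive `stored_path` from the Mendix database is not shown here and depends on your setup.

```python
import csv
import os
import shutil


def replace_dummy_files(mapping_csv, legacy_root, files_root, sizes_csv):
    """Overwrite each dummy file with its legacy original and log the real sizes.

    Returns (number_replaced, list_of_legacy_paths_that_were_skipped).
    """
    replaced, skipped = 0, []
    with open(mapping_csv, newline="", encoding="utf-8") as src, \
         open(sizes_csv, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["stored_path", "size_bytes"])
        for row in csv.DictReader(src):
            legacy = os.path.join(legacy_root, row["legacy_path"])
            target = os.path.join(files_root, row["stored_path"])
            # Only overwrite placeholders that already exist in the backup,
            # so no file appears on disk without a matching database record.
            if not (os.path.isfile(legacy) and os.path.isfile(target)):
                skipped.append(row["legacy_path"])
                continue
            shutil.copyfile(legacy, target)
            writer.writerow([row["stored_path"], os.path.getsize(target)])
            replaced += 1
    return replaced, skipped
```

The sizes CSV it writes can then be used as input for the SQL script that updates the file sizes, and the skipped list tells you which rows need a second look before uploading anything.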

Ronald Catersels · Answer 2 · 2019-08-04

We use the SFTP module. We have a NAS to which we can attach the hard drive with the legacy documents. We then retrieve the documents with the SFTP module and process them. For the processing part, I have created a system in which I can generate document settings, so that I can retrieve all kinds of metadata from the filename, such as creation date and type of document.
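To give an idea of the filename part, here is a minimal sketch of that kind of parsing. The naming convention `<type>_<YYYYMMDD>_<title>.<ext>` is purely an assumption for the example, as are the names `PATTERN`, `DocumentMetadata` and `parse_filename`. In practice the pattern would come from configurable settings per document type.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# Assumed convention: <type>_<YYYYMMDD>_<title>.<ext>
PATTERN = re.compile(
    r"^(?P<doc_type>[A-Za-z]+)_(?P<created>\d{8})_(?P<title>.+)\.(?P<ext>[^.]+)$"
)


class DocumentMetadata(NamedTuple):
    doc_type: str
    created: date
    title: str
    extension: str


def parse_filename(name: str) -> Optional[DocumentMetadata]:
    """Return the metadata encoded in a filename, or None if it does not fit."""
    match = PATTERN.match(name)
    if match is None:
        return None
    try:
        created = datetime.strptime(match["created"], "%Y%m%d").date()
    except ValueError:
        # Eight digits that are not a valid calendar date.
        return None
    return DocumentMetadata(
        match["doc_type"].lower(), created, match["title"], match["ext"].lower()
    )
```

Files that return `None` can be set aside for manual review instead of being imported with wrong metadata.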

Regards,

Ronald

Dragos Vrabie · Answer 3 · 2019-08-05

While I’ve not tried this kind of migration myself, you could try hosting your files in an Amazon S3 bucket instead: upload them ahead of time and use the Amazon S3 Connector.
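For the "upload them ahead of time" part, something along these lines might work as a starting point. It is an untested sketch: the bucket name, the `legacy/` key prefix and the function names are placeholders, and it assumes the third-party `boto3` package is installed and AWS credentials are already configured.

```python
import os


def plan_uploads(source_dir, key_prefix="legacy/"):
    """Return (local_path, s3_key) pairs for every file under source_dir."""
    plan = []
    for root, _dirs, files in os.walk(source_dir):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            relative = os.path.relpath(local_path, source_dir)
            # S3 keys use forward slashes regardless of the local OS.
            plan.append((local_path, key_prefix + relative.replace(os.sep, "/")))
    return plan


def upload_all(source_dir, bucket, key_prefix="legacy/"):
    """Upload every planned file to the given bucket (placeholder name)."""
    import boto3  # third-party; imported here so planning works without it

    s3 = boto3.client("s3")
    for local_path, key in plan_uploads(source_dir, key_prefix):
        s3.upload_file(local_path, bucket, key)
```

Checking the output of `plan_uploads` first lets you review the key layout before anything is sent to the bucket.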

Hope this helps.