Optimizing a microflow committing 200-300K objects

0
Hi Forum,

I am struggling with the performance of this microflow. It is used to process a large CSV file (~200-300K lines) containing order information each night. I process 500 orders in a batch, then end the transaction and start a new one until I reach the end of the CSV file. For the first couple of batches performance is fine (~15 seconds), but each following batch takes longer, until at the end one batch of 500 orders takes over half an hour to be processed. The microflow should update the order information each night, but at the moment it runs for longer than 24 hours...

Could someone give me some tips on how to improve the performance? I was under the impression that the End/Start Transaction Java actions should free up the memory, but it seems that all previously handled batches persist somewhere in memory anyway (hence each batch taking longer and longer).

Any help is much appreciated, thank you and kind regards,

Ruben Nuijten
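For reference, the batching pattern described above looks roughly like this in plain Java (a minimal sketch, not Mendix API code; commitBatch is a hypothetical stand-in for "commit 500 orders, end the transaction, start a new one"). The key point is that each batch must actually be released before the next one starts; if something keeps references to earlier batches reachable, every batch gets slower, exactly as described:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class BatchedOrderImport {

    private static final int BATCH_SIZE = 500;

    public static void main(String[] args) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(Path.of("orders.csv"))) {
            List<String> batch = new ArrayList<>(BATCH_SIZE);
            String line;
            while ((line = reader.readLine()) != null) {
                batch.add(line);
                if (batch.size() == BATCH_SIZE) {
                    commitBatch(batch);
                    // Release the processed lines; if references to earlier
                    // batches stay alive, memory use grows with each batch
                    // and each one takes longer than the last.
                    batch.clear();
                }
            }
            if (!batch.isEmpty()) {
                commitBatch(batch); // the final, partial batch
            }
        }
    }

    // Hypothetical stand-in for "commit 500 orders, then end/start the transaction".
    private static void commitBatch(List<String> batch) {
        // parse each line, update the order, commit
    }
}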
asked
4 answers
4

I would try to use the process queue for tasks like this. You can split your import into batches (as you already do) and create a QueuedAction for each of them. The actual import then happens in the queue, and for the queue it should not matter how many batches you have.
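Outside Mendix, the same queue idea looks roughly like this with a plain Java ExecutorService (a sketch of the pattern only, not the Process Queue module's actual API; importBatch and the batch-splitting step are hypothetical):

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class QueuedImport {

    public static void main(String[] args) throws InterruptedException {
        // A small fixed pool: batches wait in the queue and are worked off
        // independently, so the total number of batches does not make any
        // individual batch slower.
        ExecutorService queue = Executors.newFixedThreadPool(2);

        List<List<String>> batches = List.of(); // split your CSV into batches here
        for (List<String> batch : batches) {
            queue.submit(() -> importBatch(batch)); // one queued action per batch
        }

        queue.shutdown();
        queue.awaitTermination(1, TimeUnit.HOURS);
    }

    private static void importBatch(List<String> batch) {
        // commit this batch in its own transaction
    }
}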

answered
2

Hi Ruben,

I’ve done something similar recently, but created two separate processes (one for CSV processing into a flat table and one for processing the data into the domain model) and am not experiencing issues. Each CSV file (+/- 70k records) is still processed within one QueuedAction, but I use batches of 200 and limited the thread usage so not all files are imported at once (it runs on an S container). A sketch of the two steps follows after the links.

See below:

QA import File

https://modelshare.mendix.com/models/147efe70-5c30-4464-aaea-24da9e5d2e73/qa_filehistory_importfile

Subs Import:

https://modelshare.mendix.com/models/7f85cf20-6467-4ba6-93a4-ac570411a22b/sf_filehistory_prediction_import

https://modelshare.mendix.com/models/a7ea9df6-2559-4087-8195-a095007fb317/sf_importcsv_predictionv

QA process:

https://modelshare.mendix.com/models/6957ce60-6af0-428f-9232-299ef84e8b9a/qa_filehistory_processdata

Sub:

https://modelshare.mendix.com/models/45dd3528-a0c2-4314-8538-03befb2b4052/sf_filehistory_prediction_process
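The two steps above, sketched in plain Java (a rough analogue only; StagingRecord and Order are hypothetical stand-ins for the flat table and the domain model, the real flows are in the links):

import java.util.ArrayList;
import java.util.List;

public class TwoStageImport {

    record StagingRecord(String rawLine) {} // flat-table analogue
    record Order(String orderNumber) {}     // domain-model analogue

    private static final int BATCH_SIZE = 200;

    // Step 1: dump the CSV lines into the flat staging table with no
    // business logic yet, so the import itself stays fast.
    static List<StagingRecord> importFile(List<String> csvLines) {
        List<StagingRecord> staged = new ArrayList<>(csvLines.size());
        for (String line : csvLines) {
            staged.add(new StagingRecord(line));
        }
        return staged;
    }

    // Step 2: process the staged records into the domain model in batches
    // of 200, committing and releasing each batch before the next.
    static void processData(List<StagingRecord> staged) {
        for (int from = 0; from < staged.size(); from += BATCH_SIZE) {
            int to = Math.min(from + BATCH_SIZE, staged.size());
            List<Order> orders = new ArrayList<>(to - from);
            for (StagingRecord record : staged.subList(from, to)) {
                orders.add(new Order(record.rawLine())); // map + validate here
            }
            // commit this batch of orders
        }
    }
}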


answered
2

Try to use the process queue. You could run each of your batches as a queued action; each of them will be handled in its own thread. Good luck!

answered
1

My preferred way of importing CSVs is using CsvServices from the App Store, combined with my own Java action:

// This file was generated by Mendix Studio Pro.
//
// WARNING: Only the following code will be retained when actions are regenerated:
// - the import list
// - the code between BEGIN USER CODE and END USER CODE
// - the code between BEGIN EXTRA CODE and END EXTRA CODE
// Other code you write will be lost the next time you deploy the project.
// Special characters, e.g., é, ö, à, etc. are supported in comments.

package csvservicesextended.actions;

import java.io.InputStream;
import java.io.StringWriter;
import java.util.Arrays;
import com.mendix.core.Core;
import com.mendix.logging.ILogNode;
import com.mendix.systemwideinterfaces.core.IContext;
import com.mendix.webui.CustomJavaAction;
import csvservices.impl.CsvImporter;
import com.mendix.systemwideinterfaces.core.IMendixObject;

public class CSVImportFromFileDocument extends CustomJavaAction<java.lang.String>
{
	private IMendixObject __CSVFileDocument;
	private system.proxies.FileDocument CSVFileDocument;
	private java.lang.String CSVExportEntity;
	private java.lang.Long MaxRecords;
	private java.lang.Boolean StrictMode;
	private java.lang.Boolean HasHeader;
	private java.lang.String AlternativeHeader;
	private java.lang.String Delimiter;

	public CSVImportFromFileDocument(IContext context, IMendixObject CSVFileDocument, java.lang.String CSVExportEntity, java.lang.Long MaxRecords, java.lang.Boolean StrictMode, java.lang.Boolean HasHeader, java.lang.String AlternativeHeader, java.lang.String Delimiter)
	{
		super(context);
		this.__CSVFileDocument = CSVFileDocument;
		this.CSVExportEntity = CSVExportEntity;
		this.MaxRecords = MaxRecords;
		this.StrictMode = StrictMode;
		this.HasHeader = HasHeader;
		this.AlternativeHeader = AlternativeHeader;
		this.Delimiter = Delimiter;
	}

	@java.lang.Override
	public java.lang.String executeAction() throws Exception
	{
		this.CSVFileDocument = __CSVFileDocument == null ? null : system.proxies.FileDocument.initialize(getContext(), __CSVFileDocument);

		// BEGIN USER CODE
		logger.info("executeAction: " + this.CSVExportEntity + ", " + Arrays.toString(this.CSVExportEntity.split("\\.")));
		CsvImporter csvImporter = new CsvImporter();
		// The entity is passed as "Module.Entity"; split it into module and entity name.
		String[] entityParts = this.CSVExportEntity.split("\\.");
		String moduleName = entityParts[0];
		String entityName = entityParts[1];
		int maxRecords = this.MaxRecords.intValue();
		String returnValue = "";
		// Put both the file stream and the writer in try-with-resources,
		// so the file document's input stream is closed as well.
		try (InputStream inputStream = Core.getFileDocumentContent(getContext(),
				CSVFileDocument.getMendixObject());
				StringWriter outputWriter = new StringWriter()) {
			csvImporter.csvToEntities(getContext(), outputWriter, moduleName, entityName, inputStream, this.StrictMode, maxRecords, this.HasHeader, this.AlternativeHeader, this.Delimiter);
			logger.info("Done importing: " + outputWriter.toString());
			returnValue = outputWriter.toString();
		}
		return returnValue;
		// END USER CODE
	}

	/**
	 * Returns a string representation of this action
	 */
	@java.lang.Override
	public java.lang.String toString()
	{
		return "CSVImportFromFileDocument";
	}

	// BEGIN EXTRA CODE
	private static ILogNode logger = Core.getLogger("CSV import");
	// END EXTRA CODE
}

This commits all lines in the CSV in separate transactions. Basically, you pass in an input CSV document and the entity it needs to be written to. At the end it gives you a summary of the output in the following JSON format:

{
  "lines_processed":19250,
  "status":"successfully created objects",
  "numberOfErrors":1,
  "errors":"errors"
}
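If you want to act on that summary in a follow-up Java action, it can be parsed with a JSON library (a minimal sketch, assuming org.json is on the classpath; the field names are taken from the example above):

import org.json.JSONObject;

public class ImportSummaryCheck {

    public static void main(String[] args) {
        // Example return value from the Java action above.
        String result = "{\"lines_processed\":19250,"
                + "\"status\":\"successfully created objects\","
                + "\"numberOfErrors\":1,"
                + "\"errors\":\"errors\"}";

        JSONObject summary = new JSONObject(result);
        System.out.println("Lines processed: " + summary.getInt("lines_processed"));
        System.out.println("Status: " + summary.getString("status"));
        if (summary.getInt("numberOfErrors") > 0) {
            // e.g. log the errors or mark the file document as failed
            System.out.println("Errors: " + summary.getString("errors"));
        }
    }
}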


answered