UiPath Logging with Mongo

Configure UiPath Orchestrator with MongoDB for logging

Posted by Pekka Jalonen on 19.11.2019

Logging with Mongo

UPDATE: UiPath released the LTS version (2019.10.1) on 11.11.2019, which sadly removes support for reading logs from Mongo but still supports writing to it: https://docs.uipath.com/releasenotes/docs/2019-10-1#section-logging

This is a post about UiPath integration with MongoDB for logging, utilizing the Azure Cosmos DB Mongo API. Cosmos DB is a scalable NoSQL database with practically unlimited capacity for ingesting log data.

Supported integrations with logging solutions other than Elasticsearch and MSSQL were a long-awaited feature from UiPath, especially when using Azure and wanting to maximize PaaS utilization for the backend components.

Orchestrator uses NLog for logging, so it is possible to store logs in any target that NLog itself supports. I actually built a DocumentDB (the old name for Cosmos DB) integration for Orchestrator in 2016, but at that time Orchestrator could only read logs back from Elastic or SQL.

Storing the logs in some target is not the only requirement; the logs also need to be readable from the Orchestrator UI. As of version 2019.4 this is supported: you can configure Orchestrator to store logs in Cosmos DB (Mongo API), and Orchestrator is also able to fetch logs from MongoDB. https://docs.uipath.com/orchestrator/v2019-fastTrack/docs/logging-configuration#section-mongodb

A known issue with storing logs in SQL is that it can impact Orchestrator performance, because the logs live in the same database as the Orchestrator and robot configuration. To avoid performance issues, the best practice is to scavenge logs older than 30-45 days, especially development logs, as trace-level messages can easily reach 1 MB each.

If you maintain more than two million log entries in the SQL database, you might have some performance issues. For more than that, we recommend using ElasticSearch.

If you use Elasticsearch to store your Robot logs, please note that, in certain circumstances, only 10.000 items can be queried.

Due to an Elasticsearch limitation, Cloud Platform’s Orchestrator tenants are configured to ignore Robot logs larger than 50 kb. The vast majority of logs average around 2 kb.

(Source: https://docs.uipath.com/orchestrator/docs/about-logs#section-log-storage)

UiPath's recommended default Mongo configuration:

<target name="robotMongoBuffer" xsi:type="BufferingWrapper" flushTimeout="5000">
    <target xsi:type="Mongo" name="robotMongo" connectionString="<connection_string>" databaseName="<database_name>" collectionName="<collection_name>">
        <field name="windowsIdentity" layout="${event-properties:item=windowsIdentity}"/>
        <field name="processName" layout="${event-properties:item=processName}"/>
        <field name="jobId" layout="${event-properties:item=jobId}"/>
        <field name="rawMessage" layout="${event-properties:item=rawMessage}"/>
        <field name="robotName" layout="${event-properties:item=robotName}"/>
        <field name="indexName" layout="${event-properties:item=indexName}"/>
        <field name="machineId" layout="${event-properties:item=machineId}"/>
        <field name="tenantKey" layout="${event-properties:item=tenantKey}"/>
        <field name="levelOrdinal" layout="${event-properties:item=levelOrdinal}" bsonType="Int32"/>
    </target>
</target>

FINDINGS:

Unfortunately, the way UiPath implemented this was not designed for unlimited scaling; there are real limitations when using Cosmos DB with the Mongo API for storing logs. The NLog target configuration suggested by UiPath uses tenantKey as the main index for storing and reading logs, and tenantKey is also what you would use as the partitionKey in the Mongo database. In Cosmos, there is a 10 GB data limit per unique partitionKey value. NOTE: Cosmos places no limit on how many partitions you can have, but each unique partitionKey value is limited to 10 GB of data.

Now, when you have tenantKey as the partitionKey, all logs from a specific tenant are stored in a single logical partition, which can easily fill the 10 GB limit in a few weeks. For example, at the ~2 kB average message size mentioned above, half a million messages a day amounts to roughly 1 GB of log data per day, so a busy tenant can hit 10 GB in well under two weeks; with trace-level messages approaching 1 MB each, it happens far faster.

So they have missed how this should have been implemented for Mongo, which is surprising, as they have used a rolling indexName for Elastic for years: there the index name is “indexName-yyyy.MM.dd”, so a new index is created each day. With Elastic this is not an issue and Orchestrator can read logs from those indexes. But with Mongo it cannot, because it does not know how to append the date to its read queries. So if you change your target to “tenantKey-yyyy.MM.dd”, Orchestrator can still write logs to Mongo, but can no longer read them.
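To make the mechanism concrete, this is roughly what such a rolling value would look like as an NLog layout (a sketch only; as described above, Orchestrator could then still write these logs but would no longer find them when reading):

    <field name="tenantKey" layout="${event-properties:item=tenantKey}-${date:format=yyyy.MM.dd}"/>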

SOLUTION:

Use split logging for Robot logs: write logs to both Mongo and SQL, configure Cosmos/Mongo to use tenantKey or indexName as the partitionKey, and append the date to the value, e.g. “indexName-yyyyMMdd”. Disable Orchestrator from reading logs from Mongo. Finally, build a SQL maintenance script to scavenge old logs (other than errors and warnings) from the SQL database.

This way your developers can still read logs from the Orchestrator UI for a reasonable time period, while you keep a full trail of logs in Cosmos DB. You can build another solution to read the logs from Cosmos, for example with Azure Data Factory exporting to CSV, or read them with Power BI.

Create a Cosmos DB account with the Mongo API. Create a database and a collection (Unlimited), and set the partitionKey to “indexName”.

Set the NLog rule for Robot logs to write to both the database (SQL) and Mongo targets:

<logger name="Robot.*" final="true" writeTo="database,robotMongo" />
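For context, this rule lives in the <rules> section of the <nlog> element in Orchestrator's web.config. A minimal sketch of that section is below; the exact default Robot.* rule (and any ruleName attributes) differs between Orchestrator versions, so the commented-out line is only illustrative:

    <rules>
        <!-- default Robot.* rule (SQL only), shown commented out for illustration -->
        <!-- <logger name="Robot.*" final="true" writeTo="database" /> -->
        <!-- split logging: send every Robot log event to both the SQL and the Mongo target -->
        <logger name="Robot.*" final="true" writeTo="database,robotMongo" />
    </rules>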

My recommended Mongo configuration:

<target name="robotMongoBuffer" xsi:type="BufferingWrapper" flushTimeout="5000">
    <target xsi:type="Mongo" name="robotMongo" connectionString="<connection_string>" databaseName="<database_name>" collectionName="<collection_name>">
        <field name="windowsIdentity" layout="${event-properties:item=windowsIdentity}"/>
        <field name="processName" layout="${event-properties:item=processName}"/>
        <field name="jobId" layout="${event-properties:item=jobId}"/>
        <field name="rawMessage" layout="${event-properties:item=rawMessage}"/>
        <field name="robotName" layout="${event-properties:item=robotName}"/>
        <field name="indexName" layout="${event-properties:item=indexName}-${date:format=yyyyMM}"/>
        <field name="machineId" layout="${event-properties:item=machineId}"/>
        <field name="tenantKey" layout="${event-properties:item=tenantKey}"/>
        <field name="levelOrdinal" layout="${event-properties:item=levelOrdinal}" bsonType="Int32"/>
    </target>
</target>

The focus is on indexName, where we append the year and month in the form yyyyMM, so we get a new index (partition key value) each month. Of course, if you expect more than 10 GB of log data per tenant per month, for example because you have a huge number of robots in the tenant or use trace-level logging, you can adjust it to also include the day (dd), as shown in the sketch below. Keep in mind that having the logs partitioned into daily indexes will also complicate how you eventually read them back from Cosmos.
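If you do need daily granularity, the only change is the date format in the indexName field, something like this:

    <field name="indexName" layout="${event-properties:item=indexName}-${date:format=yyyyMMdd}"/>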

Finally, set Orchestrator not to fetch logs from Mongo, so that it automatically queries logs from the SQL database instead. (If you use an App Service, you can set this key in the Application Settings and it will override the value in web.config.)

    <add key="Logs.MongoDB.RobotLogs.Enabled" value="false" />

Don’t forget to create a SQL maintenance script to purge old logs from the database, or you will soon have database performance issues. You can use, for example, an Azure Automation Runbook to execute it once a week. This script removes logs that are older than 21 days and have log level Trace or Information, so warnings and errors are kept:

DELETE FROM dbo.Logs WHERE Level IN (0, 2) AND DATEDIFF(DAY, TimeStamp, GETDATE()) > 21
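On a large Logs table a single DELETE like this can run for a long time and block other queries. A minimal sketch of a batched variant, keeping the same level filter and 21-day age cutoff (the batch size is an arbitrary example value), could look like this:

    -- purge Trace (0) and Information (2) logs older than 21 days in small batches,
    -- so each transaction stays short; avoiding a function on the TimeStamp column
    -- lets SQL Server use an index on TimeStamp if one exists
    DECLARE @BatchSize INT = 10000;
    WHILE 1 = 1
    BEGIN
        DELETE TOP (@BatchSize)
        FROM dbo.Logs
        WHERE Level IN (0, 2)
          AND TimeStamp < DATEADD(DAY, -21, GETDATE());

        IF @@ROWCOUNT < @BatchSize
            BREAK;
    END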
