Growing into log management
This might sound strange at first: why would I suggest log management to translation companies? In smaller organizations with simpler operations, text log files might be perfect. You might not have that much troubleshooting to do, and when you need to investigate the log files of your translation tool, for example, it’s just fine that somebody accesses the server machine running it, grabs the log files, fires up a text editor, and analyzes away. It’s about growth: larger and more mature organizations have typically already invested in log management, while smaller ones that plan to grow should invest in it to prevent logs from becoming a growth inhibitor. Here’s why:
The bigger you get, the more complex identifying and troubleshooting technical problems is likely to become. You probably use more applications, and you use more features of your key applications in more complex ways. Your applications typically get connected or integrated for data exchange and automation, but then when something goes wrong, it is not always evident which application is the source of an issue.
Especially in large organizations, cyberattacks have increased security consciousness. In some organizations, direct access to the server machine running your apps may be restricted to very few people, who might not always be immediately available. And the people with the right access are often the IT folks, not the same people who are able to troubleshoot application-specific problems.
Most organizations rely on several or many applications as their daily drivers. There might be more than one translation tool, plus further dedicated applications to manage vendors, customers, sales, and so on. Each of them may write its own log files (or perhaps log to a database), in its own special location, and probably in its own format.
Even if you use one translation tool like memoQ exclusively, you might have separate log files for the central application (memoQ TMS), for the web interfaces (memoQweb), a database with events for memoQ Content Connector, and finally, log files of the individuals running the memoQ desktop client. (Thinking of this, I might be missing several other specialized log files memoQ TMS has.) Your single translation tool turns out to be a family of smaller tools, and logs can already be a bit of a headache within that family.
What you get out of log management
The motivations for log management are in fact similar to the motivations for introducing a translation tool. You introduced a translation tool because it’s not feasible for your translators to work in many different applications that were never designed with translation in mind. You introduced a translation tool because it’s full of translation management and productivity features. (Surely, your translators would need a lot of training to be effective in FrameMaker, for example, and even if they learned it, they would never be very fast in it.) You introduced it so that you have central management and control over your ongoing and finished translations. It’s the same with log management: you want to standardize, centralize, simplify, and optimize. So, what does a log management system give you? (From here, I will refer to Graylog, a log management system I have experience with. It’s one of the solid choices, not the only one. I am not affiliated with them.)
Your application experts (the people who can actually make sense of application logs, not the IT folks) can fire up their web browser, open an app like Graylog in mere seconds, and search or filter away to find the root cause of a problem.
It’s easy to check if an error has ever happened, or if it has happened today, this month, and so on. You can filter by keywords in error messages, or by severity, or, when applicable, even by the name of the user that was affected. Or all of these at once when you are looking for something very specific.
Graylog is built to handle huge amounts of log data without performance issues. You can also set up retention rules to keep log data only for a certain amount of time, such as one year, and then discard it.
Logs from all your applications and systems are in one place. Of course, you can filter by the source app or system.
As Graylog “consumes” your log data, it is normalized into a standard structure, so you no longer need to remember the details of different log file formats.
You can set up alerts to get a direct message by email or in a communication app if critical issues happen. Or, when you are troubleshooting something specific, you can have Graylog send you a message when relevant errors or events happen. Or you can get notified if a certain type of error happens too many times.
Unlike with text editors, the user experience and the interface are built specifically for managing log data efficiently.
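To make the filtering point above concrete, here are two illustrative searches in Graylog’s Lucene-style query syntax. Note that the field names (source, level, user_name) are assumptions on my part: what fields exist, and what they are called, depends entirely on how your log data is parsed when it enters Graylog.

```
source:memoq-tms AND level:3
message:"Concordance error" AND user_name:Translator_KO_114
```

The first query would show all error-severity entries from one application; the second narrows a specific error message down to one affected user.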
How log files work
Before explaining how log management works and is set up in practice, it’s useful to take a step back and look at log files first. I will use memoQ as an example, again because of familiarity only. I’m not affiliated with them.
Like many complex applications, memoQ has log levels and severities for log entries. The log level is the minimum severity for a message to be included in the log file. If you set your log level to “error”, then only errors (and anything more severe) are included, and less severe messages like warnings are not written to your log file. It’s about balancing how much you want to know about what is going on: with a high log level like “error”, you might not be able to find out about events like who deleted a project by mistake, for example. If you set your log level to “verbose”, the log files will be extremely detailed and, for a busy team, genuinely huge. Your memoQ TMS will fill a log file until it reaches a size of 10 MB, then start a new one. If your translation teams are busy, or especially if you have some repeated errors or warnings in your logs, you might actually end up with many such 10 MB files daily. Unlike in Graylog, you cannot set up a “retention policy” to keep old log files for a limited time only.
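Since the application itself offers no retention policy for its rotated log files, one workaround is a small scheduled script that deletes files older than your retention window. The sketch below assumes a hypothetical log directory path and a .log file extension; adjust both to your installation.

```python
import time
from pathlib import Path

# Hypothetical location of the memoQ TMS log files -- adjust to your setup.
LOG_DIR = Path(r"C:\ProgramData\MemoQ Server\Logs")
RETENTION_DAYS = 365


def purge_old_logs(log_dir: Path, retention_days: int) -> list[Path]:
    """Delete rotated log files older than the retention window.

    Returns the list of files that were removed, for reporting.
    """
    cutoff = time.time() - retention_days * 86400
    removed = []
    for log_file in log_dir.glob("*.log"):
        # st_mtime is the last-modified time in seconds since the epoch.
        if log_file.stat().st_mtime < cutoff:
            log_file.unlink()
            removed.append(log_file)
    return removed
```

Run it daily from Task Scheduler or cron, e.g. `purge_old_logs(LOG_DIR, RETENTION_DAYS)`.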
Log files consist of log entries. A log entry contains information about one event, like an error, and can be one or more lines of text, with its own anatomy. It has a date and time when the event occurred. It may have the name of the affected user, when applicable. As I mentioned earlier, it may have a severity level. If it is about an application error, it is likely to contain the error details and the stack trace. There are countless ways to write all this information into text, including the different ways you could format the date and time, as well as the order and separation method for the rest of the information, like username or severity. So, it is only natural that standard file formats have been defined for logs: https://graylog.org/post/log-formats-a-complete-guide/. Unfortunately, memoQ has its own proprietary format, but it is relatively straightforward to read. Here’s an example:
2024.10.11 01:21:39.5294500 <unknown> (Error) (Translator_KO_114) (23,23) An error occurred in the translation memory.
TYPE:
MemoQ.Common.Application.TMIdentifiedException
MESSAGE:
Source name:"memoq.mycompany.com"; TM ; Guid:"bbb4e818-8ffa-4b6a-bf5a-fe03cbc0279f"; Name:"MY TRANSLATION MEMORY"; Message:"Concordance error."
[MY TRANSLATION MEMORY] [151] Concordance result: cannot convert word indices to char indices
SOURCE:
MemoQ.Networking.BLL
CALL STACK:
at MemoQ.Networking.BLL.TranslationMemory.ServerTranslationMemoryManager.GeneralHandleException(Exception e, Guid tmGuid, String tmName, String messageResourceId)
First, we have a date and time, then the software module inside memoQ that had the error (which in this case is “unknown” for some reason), then the severity of the event and the username. I have to admit that after all these years, I have no idea what the pair of numbers after the username means. Finally, we have a summary message for the error, followed by more details on further lines, including the technical details of the error, which are not fully included here. The entry is fairly human-readable, but to be able to feed it into the database of your log management system (Graylog), you need to define exact parsing rules to tell Graylog what is what: it needs to read the username into its own field, and the timestamp, and so on. If the developers of an application use a standard log file format, this extra step of setting up custom parsing can be avoided. I also carefully “anonymized” the username and any information that belongs to the customer, like the name of the translation memory, which could have pointed to a customer name or a product name.
How log management works
I do not want to go very deep here; I’ll just explain the basics. Log files are saved by applications and other pieces of software, typically somewhere in the file system of the same machine where the application is running. Sometimes they go into a database instead of a file. However, a log management system like Graylog is likely running on a completely different machine or, increasingly, at a provider in the “cloud”. So, the log information needs to be sent from the application machines to the log management system.
There are various small pieces of software called “log collectors”, which must be installed on the machines you are collecting log data from. The log collectors watch log files or log databases (like the Windows Event Log) for new log entries and send them over to Graylog. Graylog has a helper application called Graylog Sidecar that manages the various log collectors. In practice, you install Sidecar on the machines you are “watching”, and it will install the collectors for you as needed. There are a few boring steps here, like configuring the address of Graylog, or pointing your log collector to the log file or log database, and so on. As I pointed out earlier, some applications may have their own proprietary log file formats, in which case you will need to configure your log collector to parse the log files and entries correctly, mapping the timestamp, severity, and so on to the correct fields in the Graylog database.
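Under the hood, what a collector sends to Graylog is often a GELF message (Graylog Extended Log Format): a small JSON document with a few required fields, where custom fields are prefixed with an underscore. The sketch below builds such a payload from an already-parsed entry; the host name, server address, and the mapping from memoQ’s textual severities to syslog numbers are my own assumptions for illustration.

```python
import json
import socket
import time

# Syslog severity numbers used by GELF; mapping memoQ's textual
# severities onto them is an assumption, not something memoQ documents.
SEVERITY_TO_SYSLOG = {"Error": 3, "Warning": 4, "Information": 6, "Verbose": 7}


def to_gelf(entry: dict, host: str = "memoq-tms-01") -> bytes:
    """Render a parsed log entry as a GELF 1.1 JSON payload."""
    message = {
        "version": "1.1",
        "host": host,
        "short_message": entry["message"],
        "timestamp": entry.get("timestamp", time.time()),
        "level": SEVERITY_TO_SYSLOG.get(entry.get("severity"), 6),
        # Custom fields must start with an underscore per the GELF spec.
        "_user_name": entry.get("user", ""),
        "_source_module": entry.get("module", ""),
    }
    return json.dumps(message).encode("utf-8")


def send_gelf(payload: bytes, server: str = "graylog.example.com") -> None:
    """Send one entry as one UDP datagram; 12201 is Graylog's default GELF UDP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (server, 12201))
```

In practice you would not write this yourself; Sidecar-managed collectors do exactly this job. The sketch only shows that the transport is unglamorous: structured fields, serialized, shipped over the network.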
Log anonymization
Oftentimes, you need to involve somebody else in a troubleshooting effort, for example, the support team of your translation tool. In a translation company, there are rules to protect information that belongs to your customers: you might not be able to disclose the customer's name, or information about a product you are localizing or translating. You might also not want to share the names of your translators. All of these types of information can occur in the log files, especially if high verbosity is configured. Using Graylog, you can set up automation to pre-process the log data before it is saved to the Graylog database, deleting or anonymizing this type of information. However, it can be a double-edged sword: if you “hide” the translator’s username this way, you won’t be able to filter for errors that affected that user.
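One way to soften that trade-off is pseudonymization instead of plain deletion: replace each username with a stable token derived from it, so the name is gone but "all errors of the same user" remains a valid filter. The sketch below does this with hashing; the regex patterns are assumptions based solely on the sample entry shown earlier, and in Graylog you would express the same idea as a pipeline rule rather than Python.

```python
import hashlib
import re

# Illustrative patterns only: the username and TM-name formats here are
# guessed from the single sample entry, not from any memoQ documentation.
USER_RE = re.compile(r"\(Translator_[A-Z]{2}_\d+\)")
TM_NAME_RE = re.compile(r'Name:"[^"]*"')


def pseudonym(value: str) -> str:
    """Stable pseudonym: the same input always maps to the same token,
    so you can still filter by (pseudonymized) user later."""
    return "user-" + hashlib.sha256(value.encode("utf-8")).hexdigest()[:10]


def anonymize(line: str) -> str:
    """Scrub one log line before it is stored."""
    line = USER_RE.sub(lambda m: "(" + pseudonym(m.group(0)) + ")", line)
    # Customer-identifying resource names are simply redacted.
    line = TM_NAME_RE.sub('Name:"[REDACTED]"', line)
    return line
```

Because the token is deterministic, every entry from the same translator carries the same pseudonym, which keeps per-user filtering possible while the real name never reaches the database.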
“Thinking of this, I might be missing several other specialized log files memoQ TMS has”
I think I have a complete list that I can share with you or your readers, along with paths, if that’s of interest to anybody.