An HP Vertica database has two main log files: dbLog and vertica.log. In most cases you will be working with vertica.log. The dbLog file is used by Vertica during database startup, so if you have trouble starting the database, check dbLog for recent messages. If the database started successfully, look at the contents of vertica.log instead.
How do I locate log files on my cluster?
Both log files are stored in the database catalog. If your database is running, you can query it for the location of the log files:
select /*+label(Source: www.dbjungle.com )*/
       node_name,
       replace(replace(storage_path,node_name,''),'/_catalog/Catalog','/dbLog') as dbLog_location,
       replace(storage_path,'/Catalog','/vertica.log') as vertica_log_location
  from disk_storage
 where storage_usage = 'CATALOG'
 order by node_name;
Notice that dbLog has the same path on all the nodes in the cluster, while vertica.log has a unique path on each node because the node name is part of the path.
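The string surgery the query above performs can be illustrated outside the database. Here is a minimal Python sketch of how the two locations derive from a node's catalog path; the catalog path and node name below are illustrative, not taken from a real cluster:

```python
# A sketch of the path derivation done by the SQL replace() calls above.
def log_paths(node_name: str, storage_path: str) -> dict:
    """Derive dbLog and vertica.log locations from a node's catalog path."""
    # dbLog lives at the database level: stripping the node name from the
    # catalog path yields the same location on every node.
    dblog = storage_path.replace(node_name, "").replace(
        "/_catalog/Catalog", "/dbLog")
    # vertica.log sits next to the Catalog directory, so its path keeps
    # the node name and differs per node.
    vlog = storage_path.replace("/Catalog", "/vertica.log")
    return {"dbLog": dblog, "vertica.log": vlog}

paths = log_paths(
    "v_testdb_node0001",
    "/home/dbadmin/testdb/v_testdb_node0001_catalog/Catalog")
print(paths["dbLog"])        # /home/dbadmin/testdb/dbLog
print(paths["vertica.log"])  # /home/dbadmin/testdb/v_testdb_node0001_catalog/vertica.log
```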
If your database is not running, or you simply prefer adminTools, start adminTools, navigate to “Configuration Menu”, then “View Database”, select your database, and hit OK.
How much historical log data does Vertica keep?
Vertica uses the logrotate utility to manage log files. When Vertica is installed, it configures logrotate to automatically rotate both database log files (dbLog and vertica.log). By default it rotates log files weekly; the files are compressed after rotation and kept on disk for 52 weeks. Archived files are kept in the same folder as their corresponding logs. If you need log files from a month ago, go to the folder where the original log file is located and look for the archived log covering the appropriate dates.
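Finding the right archive by date can be scripted. A hedged sketch: it scans a log directory for rotated files whose modification time falls in a window of interest. The naming pattern vertica.log.N.gz is an assumption here; check your logrotate configuration for the exact scheme it uses.

```python
import os

def archived_logs(log_dir, newer_than, older_than):
    """Return rotated vertica.log archives whose mtime (epoch seconds)
    falls within [newer_than, older_than]."""
    hits = []
    for name in sorted(os.listdir(log_dir)):
        # Assumed rotation naming: vertica.log.1.gz, vertica.log.2.gz, ...
        if name.startswith("vertica.log.") and name.endswith(".gz"):
            mtime = os.path.getmtime(os.path.join(log_dir, name))
            if newer_than <= mtime <= older_than:
                hits.append(name)
    return hits
```

Point it at the catalog directory returned by the query above, with a window of epoch timestamps bracketing the dates you care about.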
If you want to change the default log rotation schedule, you can easily do so with adminTools, whose command line has a logrotate option. Here is a sample command:
$ admintools -t logrotate -d testdb -r weekly -k 24
This command sets log rotation for the database testdb to a weekly schedule and keeps 24 weeks of logs. Once the changes are made, you will need to distribute them across your Vertica cluster.
What log will my user-defined extensions (UDx) write to?
You can code your user-defined functions to write to log files, but those messages can end up in two different places. When you create your function, you specify which mode it will run in; the possible choices are UNFENCED and FENCED.
Functions in UNFENCED mode write to the standard vertica.log file.
Functions in FENCED mode write to UDxFencedProcesses.log, which is located in the UDxLogs subfolder of the catalog directory. You can run this SQL to get the exact location of this log file:
select /*+label(Source: www.dbjungle.com )*/
       node_name,
       replace(storage_path,'/Catalog','/UDxLogs/UDxFencedProcesses.log') as UDxFencedProcesses_log_location
  from disk_storage
 where storage_usage = 'CATALOG'
 order by node_name;
UDxFencedProcesses.log has a unique path on each node since it has the node name as part of the path.
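When debugging a fenced UDx, you usually only need the tail end of that log. A minimal sketch that returns the last few lines of any log file; the path argument would be whatever the query above returned for your node:

```python
from collections import deque

def tail(path, n=20):
    """Return the last n lines of a text file, newline-stripped."""
    # deque with maxlen keeps only the final n lines while streaming,
    # so even a large log is read without holding it all in memory.
    with open(path, errors="replace") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]
```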
Is there a place where I can find load rejections and exceptions?
When you load data into Vertica using the COPY command, some portion of the data may get rejected. Rejections happen when Vertica cannot parse the data you are trying to load. This can happen for different reasons, e.g. “not enough values”, “too many values”, data that cannot be parsed into the target data type, unexpected delimiters, etc.
During the PARSE phase Vertica tries to parse each line of your file and fit it into a record of a table. If Vertica is not successful, it automatically writes a copy of the rejected row to a rejected-data file and, at the same time, writes a corresponding error message to an exceptions file. By default, both files are created in the CopyErrorLogs subdirectory of the database catalog.
Here is an example of a statement and its auto-generated files:
COPY public.table_test FROM '/tmp/load_me' ON v_testdb_node0001
DELIMITER AS '|' NULL AS ''
REJECTMAX 5000 DIRECT;
This statement produces a pair of files in CopyErrorLogs: a rejected-data file and a corresponding exceptions file.
You can override this default behaviour by passing the REJECTED DATA and EXCEPTIONS parameters to the COPY command. If you specify file names with a full path, the output will be written to the locations you requested. If you specify just file names without a path, the files will have the names you specified but will still be written to the CopyErrorLogs directory.
COPY public.table_test FROM '/tmp/load_me' ON v_testdb_node0001
REJECTED DATA 'load_me_rejected' ON v_testdb_node0001
EXCEPTIONS 'load_me_exceptions' ON v_testdb_node0001
DELIMITER AS '|' NULL AS ''
REJECTMAX 5000 DIRECT;
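Reviewing rejections is easier when each bad row sits next to its error message. A hedged sketch that reads the two files side by side; it assumes one rejected row per line and one message per line, written in the same order (the file names echo the COPY statement above):

```python
def pair_rejections(rejected_path, exceptions_path):
    """Return (rejected_row, error_message) pairs for review."""
    with open(rejected_path) as rej, open(exceptions_path) as exc:
        rows = [line.rstrip("\n") for line in rej]
        msgs = [line.rstrip("\n") for line in exc]
    # zip truncates to the shorter file if the two ever disagree in length
    return list(zip(rows, msgs))

# Hypothetical usage with the names from the COPY above:
# for row, msg in pair_rejections("load_me_rejected", "load_me_exceptions"):
#     print(msg, "->", row)
```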
The CopyErrorLogs directory is part of the database catalog. You can run this SQL to get its exact location:
select /*+label(Source: www.dbjungle.com )*/
       node_name,
       replace(storage_path,'/Catalog','/CopyErrorLogs/') as CopyErrorLogs_location
  from disk_storage
 where storage_usage = 'CATALOG'
 order by node_name;
The CopyErrorLogs directory has a unique path on each node since it has the node name as part of the path.
Nice post!
Vertica also provides a startup.log in the same location as vertica.log that captures only information about node startup. It is in JSON format and is best for tracking node recoveries.
Thank you for this nice post!!
Is there a way to change the log path?
I’d like to have them in another volume to optimize space.
Thanks again.
I am not aware of a supported way to change the log path. The logs are considered part of the Vertica catalog and have a lot of things built around them. For example, Vertica will rotate the logs for you automatically, and when you run the Vertica support tool called scrutinize, it will automatically grab the logs and put them into the resulting TAR file, which you then send to the Vertica support team.
The only way to move the logs around is to move the entire catalog. To do this you can specify different locations for “Catalog pathname” and “Data pathname”. The best time to do this is during database creation; changing the catalog location of an existing database is a technically involved procedure.