Blog

Victor is a full stack software engineer who loves travelling and building things. Most recently created Ewolo, a cross-platform workout logger.

    Extracting data from mysql

    I recently had the need to extract some data from a MySQL database running in production and while some methods are brutally simple and some a little more involved, all of them come with a few caveats.

    Database dump

    The mysqldump command can be used to dump a database to disk. Note that in this case the entire database (structure and data) is dumped as a single sql file containing the requisite commands which can be used to re-create the database. The following is an example of a script that can be used to create a database backup:

    
    #!/bin/sh
    
    DBHOST=${1}
    DBPORT=${2}
    DBUSER=${3}
    DBNAME=${4}
    POSTFIX=${5}
    DATE=`date +"%Y%m%d-%H%m"`
    SQLFILE=~/db-backups/$DBNAME-${DATE}${POSTFIX}.sql
    
    mysqldump --no-tablespaces --add-drop-table --host=$DBHOST --port=$DBPORT --user=$DBUSER -p $DBNAME > $SQLFILE
    gzip -f $SQLFILE  
    

    As can be noted, running the script above should prompt you for the password. Finally, note that is mostly an option for smaller databases that can fit within a reasonable file size.

    SELECT ... INTO OUTFILE

    The following command is a nice clean way to extract data from a table into a csv file:

    
    (SELECT 'notificationId','text','status')
    UNION
    (SELECT notificationId, text, status 
    FROM Notification
    INTO OUTFILE '/tmp/notification.csv' 
    FIELDS ENCLOSED BY '"' 
    TERMINATED BY ',' 
    ESCAPED BY '"' 
    LINES TERMINATED BY '\n');
    

    Running the above command gave a mysql ERROR 1045 (28000): access denied which was solved by granting the File privilege to the user in question: sudo mysql and then GRANT FILE on *.* to exampleuser;. You might need to exit your mysql shell and log back in for the privilege to take effect.

    Running the command a second time gave a different error: ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement. The issue here is that the secure-file-priv option does not allow mysql to read/write from outside of a particular directory. This directory can be found by running the command: SHOW variables LIKE 'secure_file_priv';. In this case it was /var/lib/mysql-files. Thus changing the above command to the following resulted in the csv being created:

    
    (SELECT 'notificationId','text','status')
    UNION
    (SELECT notificationId, text, status 
    FROM Notification
    INTO OUTFILE '/var/lib/mysql-files/notification.csv'
    FIELDS ENCLOSED BY '"' 
    TERMINATED BY ',' 
    ESCAPED BY '"' 
    LINES TERMINATED BY '\n');
    

    Finally, note that /var/lib/mysql-files is user protected and any files created belong to the mysql user as well.

    One issue with the above command is that the target file must not already exist. A simple way to solve this problem is to add a timestamp. This can be achieved via a prepared statement:

    
    SET @TS = DATE_FORMAT(NOW(),'_%Y_%m_%d_%H_%i_%s');
    
    SET @FOLDER = '/var/lib/mysql-files/';
    SET @PREFIX = 'notifications';
    SET @EXT    = '.csv';
    
    SET @CMD = CONCAT("SELECT * FROM notifications INTO OUTFILE '",@FOLDER,@PREFIX,@TS,@EXT,
                "' FIELDS ENCLOSED BY '\"' TERMINATED BY ',' ESCAPED BY '\"'",
                "  LINES TERMINATED BY '\r\n';");
    
    PREPARE statement FROM @CMD;
    
    EXECUTE statement;
    

    Finally, in case you have null values it is advisable to convert them to something useful as follows:

    
    (SELECT 'notificationId','text','status')
    UNION
    (SELECT notificationId, text, IFNULL(status, "unknown") 
    FROM Notification
    INTO OUTFILE '/var/lib/mysql-files/notification.csv'
    FIELDS ENCLOSED BY '"' 
    TERMINATED BY ',' 
    ESCAPED BY '"' 
    LINES TERMINATED BY '\n');
    

    In my case, I needed to extract data from a production instance where I did not have easy access via a db tool like MySQL workbench. If you do have access via a tool, you're probably better off extracting data using the tool as it will be more robust and provide more features :).

    HackerNews submission / discussion

    Back to the article list.

    SmallData newsletter

    Subscribe to get articles as they are published direct to your inbox!