Julia - MySQL への接続・データを CSV・JSON で保存する

環境

  • ArchLinux
  • MySQL 8.0
  • Julia 1.1.1
  • MySQL.jl 0.7.0
  • JSON.jl 0.21.0
  • Tables.jl 0.2.10

Docker で MySQL を立ち上げる

$ docker run --name mysql -e MYSQL_ROOT_PASSWORD=test -d -p 3306:3306 mysql:8.0

# 以下 MySQL 8.0 の場合は必要
$ cd ~/.julia/packages/MySQL/owtkt/deps/usr/lib
$ mv libmariadb.so.3 libmariadb.so.3.bk
$ cp /usr/lib/libmysqlclient.so libmariadb.so.3

共有ライブラリを入れ替えないと以下のようなエラーが出た。

ERROR: LoadError: (2059): Plugin caching_sha2_password could not be loaded: lib/mariadb/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory

参考

using Pkg
Pkg.pkg"add Tables MySQL JSON"

import MySQL, JSON
import Tables: rowtable, columntable
using DelimitedFiles
import Random: randstring, seed!
seed!(0)

# MySQL へ接続
host = "127.0.0.1"
user = "root"
password = "test"
conn = MySQL.connect(host, user, password, port=3306)

# 適当なデータベースとテーブルを作る
MySQL.execute!(conn, "create database if not exists testdb")
create_table = """
create table if not exists testdb.testtb (
    id int,
    name varchar(255),
    rand decimal(10,8)
)
"""
MySQL.execute!(conn, create_table)

# 適当なデータを入れる
for id in 1:1000
    sql = """insert into testdb.testtb values
            ($(id), \"$(randstring())\", $(rand()));"""
    MySQL.execute!(conn, sql)
end

# select で取ったデータを JSON に保存
data = MySQL.Query(conn, "select * from testdb.testtb;")
json_data = JSON.json(rowtable(data))
write("test.json", json_data)

# data = MySQL.Query(conn, "select * from testdb.testtb;")
# columntable(data)

# CSVに保存
data = MySQL.Query(conn, "select * from testdb.testtb;")
writedlm("test.csv", data, ',')

こんな感じのJSONファイルができる。 test.json

[
  {
    "id": 1,
    "name": "0IPrGg0J",
    "rand": 0.361828
  },
  {
    "id": 2,
    "name": "ONTEB5dh",
    "rand": 0.9678
  },
...
  {
    "id": 998,
    "name": "FTmSKB4K",
    "rand": 0.479956
  },
  {
    "id": 999,
    "name": "MNnZshH1",
    "rand": 0.442873
  },
  {
    "id": 1000,
    "name": "VLpDa0r7",
    "rand": 0.87176
  }
]

rowtable, columntable の違い

rowtable を使うと行ごとに NamedTuple ができる。

julia> rowtable(data)
1000-element Array{NamedTuple{(:id, :name, :rand),Tuple{Union{Missing, Int32},Union{Missing, String},Union{Missing, Dec64}}},1}:
 NamedTuple{(:id, :name, :rand),Tuple{Union{Missing, Int32},Union{Missing, String},Union{Missing, Dec64}}}((1, "0IPrGg0J", 0.361828))  
 NamedTuple{(:id, :name, :rand),Tuple{Union{Missing, Int32},Union{Missing, String},Union{Missing, Dec64}}}((2, "ONTEB5dh", 0.9678))
 NamedTuple{(:id, :name, :rand),Tuple{Union{Missing, Int32},Union{Missing, String},Union{Missing, Dec64}}}((3, "4LVno7oc", 0.156637))  
 NamedTuple{(:id, :name, :rand),Tuple{Union{Missing, Int32},Union{Missing, String},Union{Missing, Dec64}}}((4, "uaJ6CBGW", 0.035345))  
 NamedTuple{(:id, :name, :rand),Tuple{Union{Missing, Int32},Union{Missing, String},Union{Missing, Dec64}}}((5, "2iSNQhb3", 0.838075))  
 NamedTuple{(:id, :name, :rand),Tuple{Union{Missing, Int32},Union{Missing, String},Union{Missing, Dec64}}}((6, "D3AaC1PT", 0.548342))
...

一方で columntable を使うと列ごとの NamedTuple ができる。

julia> columntable(data)
(id = Union{Missing, Int32}[1, 2, 3, 4, 5, 6, 7, 8, 9, 10991, 992, 993, 994, 995, 996, 997, 998, 999, 1000], name = Union{Missing, String}["0IPrGg0J", "ONTEB5dh", "4LVno7oc", "uaJ6CBGW", "2iSNQhb3", "D3AaC1PT", "wlXVZlHn", "p5t3gIVm", "fqZm9qxn", "qUT6Tdav""CCmmVuLs", "YvGlFamh", "o9yd2pJL", "gOcdoaHF", "cmwiizfA", "7NDyuHil", "xyRbhbQn", "FTmSKB4K", "MNnZshH1", "VLpDa0r7"], rand = Union{Missing, Dec64}[0.361828, 0.9678, 0.156637, 0.035345, 0.838075, 0.548342, 0.938857, 0.068223, 0.080023, 0.3824530.844706, 0.717869, 0.382504, 0.092114, 0.632793, 0.415132, 0.446782, 0.479956, 0.442873, 0.87176])

github.com