Getting PySpark and MemSQL to work together

Is there any way to connect to MemSQL from PySpark? I have a data pipeline built on Kafka and Spark, and now I want to store the data in MemSQL. I could only find a connector for Scala. Is there a connector for PySpark or Java?

Hi mbhanda2

Yes, there is a way. MemSQL speaks the MySQL wire protocol, so you can use the MariaDB Java connector (MariaDB Connector/J) for both Java and PySpark.
Here is a simple example. First, start the PySpark shell with the connector JAR on the classpath:

pyspark --driver-class-path "PATH_TO_JAR" --jars "PATH_TO_JAR"

Then, inside the shell:

host="172.17.0.2"
port="3306"
database="name"
jdbcUrl = "jdbc:mariadb://{0}:{1}/{2}".format(host, port, database)
properties = {
	"user": "root",
	"password": "password",
}
# Reading takes no "mode" argument; the URL variable defined above is jdbcUrl.
df = spark.read.jdbc(url=jdbcUrl, table="tb", properties=properties)
df.show()

# Replace MODE_NAME with one of "append", "overwrite", "ignore", or "error".
df.write.jdbc(url=jdbcUrl, table="tb", mode="MODE_NAME", properties=properties)
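
Since the question mentions a Kafka pipeline, here is a rough sketch of how the same JDBC write could be applied to each micro-batch of a stream. It assumes Structured Streaming (Spark 2.4 or later, where foreachBatch is available in PySpark) and that the Kafka source package for your Spark version is on the classpath. The broker address, topic name, and the function names write_batch and start_pipeline are placeholders made up for illustration, not something from your setup.

```python
JDBC_URL = "jdbc:mariadb://172.17.0.2:3306/name"
PROPERTIES = {"user": "root", "password": "password"}


def write_batch(batch_df, batch_id):
    """Write one micro-batch to the table "tb" over JDBC."""
    # "append" adds the rows of each micro-batch to the existing table.
    batch_df.write.jdbc(url=JDBC_URL, table="tb", mode="append",
                        properties=PROPERTIES)


def start_pipeline(spark, servers="localhost:9092", topic="events"):
    """Start a streaming query that copies a Kafka topic into the table.

    `servers` and `topic` are placeholder values (assumptions).
    """
    stream = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", servers)
              .option("subscribe", topic)
              .load()
              # Kafka delivers key and value as binary; cast them to strings.
              .selectExpr("CAST(key AS STRING) AS key",
                          "CAST(value AS STRING) AS value"))
    return stream.writeStream.foreachBatch(write_batch).start()
```

In the PySpark shell you would call query = start_pipeline(spark) and then query.awaitTermination(). Going through foreachBatch lets the streaming job reuse the ordinary batch JDBC writer shown above.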

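NOTE: sql_mode should be set to “ANSI_QUOTES”. Spark's generic JDBC dialect wraps column names in double quotes, and MemSQL only treats double-quoted names as identifiers in that mode.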
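
If you cannot change sql_mode on the server, one option that may work is to set it per connection through the JDBC URL. The sketch below assumes your version of MariaDB Connector/J supports the sessionVariables URL option, so please check the driver documentation first. The helper name build_jdbc_url is made up for illustration.

```python
def build_jdbc_url(host, port, database, session_variables=None):
    """Build a MariaDB-style JDBC URL, optionally with session variables.

    Assumption: the driver honors the `sessionVariables` URL option.
    """
    url = "jdbc:mariadb://{0}:{1}/{2}".format(host, port, database)
    if session_variables:
        # Several variables are joined with commas: a=1,b=2
        pairs = ",".join(
            "{0}={1}".format(name, value)
            for name, value in session_variables.items()
        )
        url += "?sessionVariables=" + pairs
    return url


jdbcUrl = build_jdbc_url(
    "172.17.0.2", "3306", "name",
    session_variables={"sql_mode": "ANSI_QUOTES"},
)
print(jdbcUrl)
```

The resulting jdbcUrl can be passed to spark.read.jdbc and df.write.jdbc exactly as in the example above. Keep in mind that setting sql_mode this way replaces whatever modes the session would otherwise have had.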

Thanks, Ramzes.
Can you give me the example in Java too?

Sure

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {
    static final String JDBC_DRIVER = "org.mariadb.jdbc.Driver";
    static final String DB_URL = "jdbc:mariadb://172.17.0.2:3306/db";

    public static void main(String[] args) {
        try {
            // Load the MariaDB driver (optional with JDBC 4+, kept for clarity).
            Class.forName(JDBC_DRIVER);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            return;
        }
        // try-with-resources closes the result set, statement, and connection.
        try (Connection conn = DriverManager.getConnection(DB_URL, "user", "password");
             Statement stmt = conn.createStatement();
             // A SELECT must go through executeQuery, not executeUpdate.
             ResultSet rs = stmt.executeQuery("SELECT * FROM tb")) {
            while (rs.next()) {
                // Print the first column of each row.
                System.out.println(rs.getString(1));
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

NOTE: The MariaDB JDBC driver must be accessible from the project (add it to the classpath or declare it as a project dependency).

Thanks, Ramzes. I couldn’t find any documentation on this topic. You saved my project.