Using DSE geometric types in Spark

DSE geometric types can be used in Spark.

About this task

This example shows how to create a table containing geometric types using CQL, and then read from and write to that table in the DSE Spark shell. In the Spark steps, geometric values are exchanged with the table as well-known text (WKT) strings.

Procedure

  1. Start cqlsh.

    cqlsh
  2. Create the test keyspace for the geometric data.

    CREATE KEYSPACE IF NOT EXISTS test WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 1 };
  3. Create the table to store the geometric data.

    CREATE TABLE IF NOT EXISTS test.geo (
        k INT PRIMARY KEY,
        pnt 'PointType',
        line 'LineStringType',
        poly 'PolygonType');
  4. Insert a test row.

    INSERT INTO test.geo (k, pnt, line, poly) VALUES
        (1,
        'POINT (1.1 2.2)',
        'LINESTRING (30 10, 10 30, 40 40)',
        'POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))');
  5. Exit cqlsh.

    QUIT;
  6. Start a Spark shell.

    dse spark
  7. Import the geometric types, type converters, and save modes.

    import com.datastax.driver.dse.geometry._
    import com.datastax.spark.connector.types.DseTypeConverter.{LineStringConverter, PointConverter, PolygonConverter}
    import org.apache.spark.sql.SaveMode
  8. Read the Point value from the test row in test.geo.

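    // The geometry column is read as a well-known text (WKT) string.
    val results = spark.read.cassandraFormat("geo", "test").load().select("pnt").collect()
    // Parse the WKT string from the first row into a Point object.
    val point1 = Point.fromWellKnownText(results(0).getString(0))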
  9. Write a Polygon value to a new row in test.geo.

    val polygon1 = new Polygon(
        new Point(30, 10),
        new Point(40, 40),
        new Point(20, 40),
        new Point(10, 20),
        new Point(30, 10))
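    // polygon1.toString yields the WKT text, which is written as a string column.
    val df = spark.createDataFrame(Seq((2, polygon1.toString))).select(col("_1") as "k", col("_2") as "poly")
    // Append the row with k = 2 to test.geo.
    df.write.mode(SaveMode.Append).cassandraFormat("geo", "test").save()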
  10. Check that the value was written to the test.geo table.

    spark.read.cassandraFormat("geo","test").load().show()
    +---+--------------------+---------------+--------------------+
    |  k|                line|            pnt|                poly|
    +---+--------------------+---------------+--------------------+
    |  1|LINESTRING (30 10...|POINT (1.1 2.2)|POLYGON ((30 10, ...|
    |  2|                null|           null|POLYGON ((30 10, ...|
    +---+--------------------+---------------+--------------------+
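
The procedure parses only the Point column. As a minimal sketch, the LineString and Polygon columns of the same row could be read back in a similar way. This assumes the same `dse spark` session (so `spark` is already defined), the test.geo table and rows created above, and that `LineString.fromWellKnownText` and `Polygon.fromWellKnownText` accept the WKT strings in the same way as the `Point` variant used in step 8. The names `row`, `pnt`, `line`, and `poly` are illustrative only.

```scala
// Explicit imports; harmless if the shell has already imported them.
import com.datastax.driver.dse.geometry.{LineString, Point, Polygon}
import org.apache.spark.sql.cassandra._
import org.apache.spark.sql.functions.col

// Row k = 1 is the only row with all three geometry columns set;
// row k = 2 has null pnt and line values, so it is filtered out here.
val row = spark.read.cassandraFormat("geo", "test").load().filter(col("k") === 1).select("pnt", "line", "poly").head()

// Each geometry column arrives as a WKT string, so parse it with the matching type.
val pnt = Point.fromWellKnownText(row.getString(0))
val line = LineString.fromWellKnownText(row.getString(1))
val poly = Polygon.fromWellKnownText(row.getString(2))

// Inspect the parsed objects.
println(s"point: x=${pnt.X()}, y=${pnt.Y()}")
println(s"line string points: ${line.getPoints.size}")
println(s"polygon exterior ring points: ${poly.getExteriorRing.size}")
```

Filtering on k before parsing avoids passing a null string to the WKT parsers, which matters once rows such as k = 2 leave some geometry columns unset.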
