Day 2: Lines

Day two of the 30 day map challenege is lines! A few weeks ago I wrote a post about stream traversal with SedonaDB. For day two I’ll do this with the river system for Gaspereau Lake (starting at the mouth of the beautiful Gaspereau River!). For anybody not familiar with the Gaspereau River, it features excellent wineries and the best tubing this side of any line you draw.

Let’s get to it!

# pip install "apache-sedona[db]"
import sedona.db

sd = sedona.db.connect()
sd.options.interactive = True

Let’s get ourselves some data! I’ll use the same GeoParquet as in the last post, but while in the last post I downloaded it using curl, I’ve since realized that SedonaDB can load it straight from the http url in a few seconds. I’ll augment the segments right out of the gate with the start/end points…the feature code list I had to iterate on a bit because the Gaspereau River system has a few fun hydrological components I hadn’t spotted in the last post (notably: canals, flumes, dams, and coastal rivers).

url = "https://github.com/geoarrow/geoarrow-data/releases/download/v0.2.0/ns-water_water-line_geo.parquet"

sd.read_parquet(url).to_view("lines", overwrite=True)
sd.sql("""
SELECT
    "OBJECTID",
    "FEAT_CODE",
    geometry,
    ST_StartPoint(geometry) AS start,
    ST_EndPoint(geometry) as end
FROM lines
WHERE
    ST_GeometryType(geometry) = 'ST_LineString' AND
    "FEAT_CODE" IN ('WARV50', 'WARV55', 'WARV59', 'WALK59', 'WACORV59', 'WAFU59', 'WADM59', 'WACA59')
""").to_memtable().to_view("lines_with_ends", overwrite=True)

Next, I need to find the segment at the mouth of the river. Lonboard crashes when I try to feed it the whole layer, so I’ll use the reprojected lon/lat bounding box trick from the last post to view a subset.

import lonboard

box = "POLYGON ((-65.180511 44.741856, -63.838806 44.741856, -63.838806 45.165579, -65.180511 45.165579, -65.180511 44.741856))"
points_crs_json = sd.view("lines_with_ends").schema.field("geometry").type.crs.to_json()

mouth_of_river = sd.sql(f"""
SELECT "OBJECTID", "FEAT_CODE", geometry FROM lines
WHERE ST_Intersects(
  geometry,
  ST_Transform(ST_SetSRID(ST_GeomFromWKT('{box}'), 4326), '{points_crs_json}')
)
""").to_memtable()

lonboard.viz(mouth_of_river)

We’ve got it! Let’s start the network. Unlike the last post where I built a network, in this one I’m going to keep track of the “generation” so we can make a fun animation.

sd.sql("""
SELECT *, 0 AS gen,
FROM lines_with_ends
WHERE "OBJECTID" = 99950 """
).to_view("network", overwrite=True)

sd.view("network")
┌──────────┬───────────┬──────────────────────┬──────────────────────┬─────────────────────┬───────┐
│ OBJECTID ┆ FEAT_CODE ┆       geometry       ┆         start        ┆         end         ┆  gen  │
│   int64  ┆    utf8   ┆       geometry       ┆       geometry       ┆       geometry      ┆ int64 │
╞══════════╪═══════════╪══════════════════════╪══════════════════════╪═════════════════════╪═══════╡
│    99950 ┆ WACORV59  ┆ LINESTRING Z(399395… ┆ POINT Z(399395.4709… ┆ POINT Z(399620.406… ┆     0 │
└──────────┴───────────┴──────────────────────┴──────────────────────┴─────────────────────┴───────┘

Now we’ll do the traversal. I ran into an issue here where repeat items were getting added and used AI to help rewrite the SQL to ensure duplicate OBJECTIDs weren’t added.

network_count = sd.view("network").count()
network_iterations = 0

for gen in range(1, 300):
    sd.sql(f"""
    WITH new_connections AS (
        SELECT
            lines_with_ends."OBJECTID",
            lines_with_ends.geometry,
            lines_with_ends.start,
            lines_with_ends.end,
            {gen} AS gen
        FROM lines_with_ends
        JOIN network ON ST_Intersects(network.start, lines_with_ends.end)
        WHERE lines_with_ends."OBJECTID" NOT IN (SELECT "OBJECTID" FROM network)
    )
    SELECT
        "OBJECTID", geometry, start, "end", gen
    FROM network
    UNION
    SELECT
        "OBJECTID", geometry, start, "end", gen
    FROM new_connections
    """).to_memtable().to_view("network", overwrite=True)

    new_network_count = sd.view("network").count()
    if new_network_count == network_count:
        break
    else:
        network_count = new_network_count
        network_iterations += 1

print(network_iterations)
203

Let’s take a look!

lonboard.viz(sd.sql("SELECT geometry FROM network"))

The lonboard map is a good sanity check, but I’m aiming for something cooler here: an animation so we can watch our stream network “grow” outward. For this we’ll need some summary info from our network.

agg_info = sd.sql("""
SELECT
    max(gen) AS max_gen,
    ST_Envelope_Aggr(geometry) AS env,
FROM network
""").to_pandas()

max_gen, bounds = (agg_info.iloc[0, 0], agg_info.iloc[0, 1].bounds)
xmin, ymin, xmax, ymax = bounds

…to set up the figure:

import matplotlib.pyplot as plt

plt.rcParams['figure.dpi'] = 600
plt.rcParams['savefig.dpi'] = 600

…and plot up to i generations for the ith frame of the animation.

for i in range(max_gen + 1):
    ax = (
        sd.sql(f"SELECT geometry FROM network WHERE gen <= {i}")
        .to_pandas()
        .plot(linewidth=0.7, color="black")
    )
    ax.set_xlim(xmin, xmax)
    ax.set_ylim(ymin, ymax)
    ax.set_xticks([])
    ax.set_yticks([])
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.spines["bottom"].set_visible(False)
    ax.spines["left"].set_visible(False)
    plt.savefig(f"anim/gen{i:03d}.png")
    plt.close()

Finally, we can use the Python Image Library to stitch it all together. AI did a great job with the stitching part.

import glob
from PIL import Image

# Get all PNG files in the anim directory
png_files = sorted(glob.glob("anim/gen*.png"))

# Open all images
images = []
for png_file in png_files:
    img = Image.open(png_file)
    images.append(img)

# Append the last one a few times to make it sick
for i in range(20):
    images.append(img)

# Save as animated GIF
images[0].save(
    "network_animation.gif",
    save_all=True,
    append_images=images[1:],
    duration=100,  # milliseconds per frame
    loop=0
)

Tada!

Software Engineer & Geoscientist