Fiddling with GPS data

Last year, I collected some data from my car-mounted GPS in order to update openstreetmap.org with a side-road into the lake at my cottage. As my GPS is only accurate to about 3 metres, and (on the drive along this side-road) often less accurate than that, I collected trackpoints from several drives up and down the side-road. This let me refine the openstreetmap roadway path with a bit more accuracy, and add that side-road to the public map. This year, I hope to make many more trips to the cottage than I did last year, and that will give me an opportunity to collect a lot more route data and refine my openstreetmap submission even more.

The biggest problem I had last year was extracting just the trackpoints related to the side-road from the much bigger collection of trackpoints that the GPS saved in its GPX file. It took a fair amount of preprocessing (through gpsbabel) and hand-editing, but I eliminated all the extraneous trackpoints and metadata, and had enough left to import into openstreetmap. This year, I've decided to automate some of that trackpoint filtering, so as to make the map update easier.

I wrote two programs: trkpoints.c extracts trackpoints from the GPS GPX XML file, and writes the relevant timestamp, latitude, longitude and altitude information to a comma-separated-values (CSV) file. trkgpx.c reads the CSV file, and creates a GPX XML file suitable for use on openstreetmap.

The hardest part of this development was finding a way to evaluate each trackpoint and discard those that were outside my area of interest. I found a way by using haversines. For each trackpoint, I would use the haversine formula to calculate its distance from a known point (the parking lot at the end of the side-road); if the trackpoint fell within a specific distance (for me, 500 metres) of that point, I would include it in the output CSV; otherwise, I'd discard it.

This left me with a CSV containing only those GPS readings that were within a half-kilometre of the end of the road. That covered the whole side-road and a little bit of the approach: just enough to load into openstreetmap to (re)draw the path of the side-road.

(For what it's worth, I chose to store these intermediate results in a CSV so that I could use text tools to store several trips' worth of individual readings in the one file. Each trip, I would offload the GPS tracking data, filter and convert it, and concatenate the results to the single CSV file. When I had stored enough passes along the side-road in the CSV, I would then convert the CSV back into a GPX file for use on openstreetmap.)

So, another COVID-19 "shelter in place" timewaster that wasn't really a waste of time.

Attachment                       Size
trkpoints.tar.gz                 12.28 KB
trkpoints.2022-03-11.tar.gz      13.26 KB

Comments

The two programs (trkpoints.c and trkgpx.c) facilitate the building of features on openstreetmap.org from tracking data collected by a GPS.

trkpoints
trkpoints.c extracts trackpoints from the GPS GPX XML file, and writes the relevant timestamp, latitude, longitude and altitude information to a comma-separated-values (CSV) file. The program can extract all trackpoints, or a selection of trackpoints that occur within a specified distance from a specified location. This way, you can extract only those trackpoints that are relevant to a specific openstreetmap feature.

The program writes the extracted data in CSV format to stdout. For each line,

  • cell 1 contains the timestamp of the GPS reading, in unix time format,
  • cell 2 contains the GPS latitude reading,
  • cell 3 contains the GPS longitude reading,
  • cell 4 contains the GPS altitude reading,
  • cell 5 contains (if extracting by range) the distance in metres from this reading to the selected origin point,
  • cell 6 contains the direction of travel as a compass bearing (0 = N, 90 = E, etc.),
  • cell 7 contains the distance in metres from the previous GPS reading (Note: if range-limiting, the previous reading might not appear in the output), and
  • cell 8 contains the computed velocity in km per hour.

Note that both the range-limiting logic and the value stored in cell 5 are computed using haversines. I derived my haversine computation from a formula found at https://en.wikipedia.org/wiki/Haversine_formula.
For further information regarding haversine computations, you might want to visit:

Note also that the program computes the direction of travel stored in cell 6
using forward azimuth equations found at

As the program writes the CSV file in text format, you can capture and amalgamate many distinct executions of the program by concatenating the program output to a previously-created CSV file.

Note that this program uses Michael R. Sweet's Mini-XML parsing library to read GPS GPX files.

Usage:

trkpoints [origin_latitude origin_longitude range_from_origin]

Examples:

# extract only trackpoints within 1 kilometre (1000 metre) of Toronto City Hall
trkpoints 43.653443 -79.386278 1000 <DriveLog.gpx | sort >>TorontoCityHall.csv
# extract all trackpoints
trkpoints <DriveLog.gpx | sort >>AllPoints.csv

trkgpx
trkgpx.c reads a CSV file (on stdin), extracts timestamp, latitude, longitude and altitude information from the CSV rows, and writes (to stdout) a GPX XML file suitable for use on openstreetmap. The program expects the CSV file to contain rows such that

  • cell 1 contains the timestamp of the GPS reading, in unix time format,
  • cell 2 contains the GPS latitude reading,
  • cell 3 contains the GPS longitude reading, and
  • cell 4 contains the GPS altitude reading

and expects the lines of the file to be in ascending order of timestamp.
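
As a rough sketch of the input format described above, the first four cells of a row could be read like this. The names `struct reading`, `parse_row` and `rows_sorted` are hypothetical, and the sketch assumes plain numeric cells with an integer timestamp and no quoting; it is not the parsing code from trkgpx.c.

```c
#include <stddef.h>
#include <stdio.h>

/* One GPS reading taken from cells 1 to 4 of a CSV row. */
struct reading {
    long long timestamp;  /* unix time, in seconds (assumed integer) */
    double lat;           /* latitude, decimal degrees */
    double lon;           /* longitude, decimal degrees */
    double alt;           /* altitude */
};

/* Parse the first four cells of a row; any further cells are ignored.
 * Returns 1 when all four cells were read, 0 otherwise. */
int parse_row(const char *line, struct reading *r)
{
    return sscanf(line, "%lld,%lf,%lf,%lf",
                  &r->timestamp, &r->lat, &r->lon, &r->alt) == 4;
}

/* Returns 1 when the readings are in ascending timestamp order. */
int rows_sorted(const struct reading *rows, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (rows[i].timestamp < rows[i - 1].timestamp)
            return 0;
    }
    return 1;
}
```

Because only the first four cells are read, rows that carry extra cells would parse the same way under these assumptions.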

When consecutive timestamps differ by more than 5 minutes (the default, overridable by a command-line argument), the program will save the new readings in a different "track" from the previous readings. The "name" of each track defaults to the date and time of the first reading; a command-line argument will modify this value a bit.
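
The track-splitting rule can be sketched as below. This is an illustration under assumptions, not the logic from trkgpx.c: the name `assign_tracks` is hypothetical, timestamps are taken to be whole seconds already sorted in ascending order, and a gap of exactly the interval is treated as staying in the same track.

```c
#include <stddef.h>

/* Default gap between tracks: 5 minutes, expressed in seconds. */
#define DEFAULT_GAP_S 300

/* Give each reading a zero-based track number, starting a new track
 * whenever the gap from the previous reading exceeds gap_s seconds.
 * ts must hold n ascending timestamps; track must have room for n entries.
 * Returns the number of tracks produced. */
size_t assign_tracks(const long long *ts, size_t n,
                     long long gap_s, size_t *track)
{
    size_t current = 0;

    for (size_t i = 0; i < n; i++) {
        if (i > 0 && ts[i] - ts[i - 1] > gap_s)
            current++;  /* gap too large: begin a new track */
        track[i] = current;
    }
    return n > 0 ? current + 1 : 0;
}
```

Two drives recorded an hour apart would therefore land in two separate tracks, even when their rows sit next to each other in the CSV.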

Note that this program uses Michael R. Sweet's Mini-XML parsing library to write GPS GPX files.

Usage:

trkgpx [-i interval_in_seconds] [-n name]

Example:

# create a GPX file consisting of only those trackpoints from downtown Toronto
sort TorontoCityHall.csv | trkgpx -n "Downtown Toronto" >DowntownToronto.gpx

I've modified and updated the trkpoints program to create a more detailed CSV file. For each selected GPS trackpoint, the new output will contain a line of 11 comma-separated cells:

  1. cell 'A' contains the timestamp of the GPS reading, in unix time format (Note: conversion ignores any timezone indicator or offset),
  2. cell 'B' contains the GPS latitude reading,
  3. cell 'C' contains the GPS longitude reading,
  4. cell 'D' contains the GPS altitude reading in metres,
  5. cell 'E' contains the distance in metres from this reading to the selected origin point,
  6. cell 'F' contains the distance in metres from the previous GPS trackpoint to the current GPS trackpoint (Note: if range-limiting, the CSV file might not contain the previous GPS trackpoint),
  7. cell 'G' contains the computed velocity in km per hour from the previous GPS trackpoint to the current GPS trackpoint,
  8. cell 'H' contains the "entry" bearing from the previous GPS trackpoint to the current GPS trackpoint (Note: bearing ranges from 0.0 to 360.0, with 0 = North, 90 = East, 180 = South, 270 = West),
  9. cell 'I' contains the distance in metres from the current GPS trackpoint to the next GPS trackpoint (Note: if range-limiting, the CSV file might not contain the next GPS trackpoint),
  10. cell 'J' contains the computed velocity in km per hour from the current GPS trackpoint to the next GPS trackpoint, and
  11. cell 'K' contains the "exit" bearing from the current GPS trackpoint to the next GPS trackpoint (Note: bearing ranges from 0.0 to 360.0, with 0 = North, 90 = East, 180 = South, 270 = West).
Notes:
◊ The CSV will not contain trackpoints outside of the specified range. This will not interfere with the previous-to-this and this-to-next fields on trackpoints within the range.
◊ Cell 'E' will be empty if the user has not specified an origin and range.
◊ Cells 'F', 'G', and 'H' will be empty for the first trackpoint in a track.
◊ Cells 'I', 'J', and 'K' will be empty for the last trackpoint in a track.
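
To make the velocity cells and the empty-cell rule concrete, here is a small sketch. The names `speed_kmh` and `format_leg` are hypothetical, and the one-decimal output format and the choice to report 0 when no time has elapsed are assumptions rather than the program's actual behaviour.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Speed in km per hour over a leg of dist_m metres taking dt_s seconds.
 * Assumption: a leg with no elapsed time reports 0 rather than dividing
 * by zero. */
double speed_kmh(double dist_m, long long dt_s)
{
    if (dt_s <= 0)
        return 0.0;
    return (dist_m / (double)dt_s) * 3.6;  /* 1 m/s = 3.6 km/h */
}

/* Write the three cells for one leg (distance, velocity, bearing) into buf.
 * When there is no leg (first or last trackpoint of a track), the cells
 * are left empty, which leaves just the two separating commas.
 * Returns the value returned by snprintf. */
int format_leg(char *buf, size_t size, bool has_leg,
               double dist_m, long long dt_s, double bearing)
{
    if (!has_leg)
        return snprintf(buf, size, ",,");
    return snprintf(buf, size, "%.1f,%.1f,%.1f",
                    dist_m, speed_kmh(dist_m, dt_s), bearing);
}
```

The same helper would serve for both the "entry" cells ('F', 'G', 'H') and the "exit" cells ('I', 'J', 'K'), with `has_leg` false at the corresponding end of a track.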