In this article, we are going to discuss in detail what a spatial database is and the data type, POINT that is supported by PostgreSQL to perform geographic data analysis.
In my previous article, Install and upgrade PostgreSQL to support Spatial Data, I have demonstrated how to install a PostgreSQL database on a Mac and how to enable the PostGIS extension using which spatial or geographical data can be inserted into the database and spatial queries can be executed. You can also download and install the PostGIS extension on a Windows machine by following similar steps. It is highly recommended that you follow the tutorial to set up an environment first and then continue with this article further on. Once you have correctly installed the PostgreSQL database and enabled the PostGIS extension, you can then follow this article to understand more about the different spatial data types associated with PostGIS.
PostGIS is an add-on library that can be installed to an existing instance of a PostgreSQL database. The PostGIS will add specific functions that allow us to work with geographical data within the PostgreSQL database. If you want to have a descriptive overview of the PostGIS extension and how it works, I recommend you have a look at the official documentation for PostGIS.
Spatial Data Analysis and Why is it required?
Before diving deep into the technical stuff of working with Spatial Data, let us first understand what spatial data analysis is all about and why is it required. In simple terms, the study of patterns, anomalies, and theories within spatial or geographical data is termed Spatial Data Analysis. It is an extended version of data analysis in which the focus is on geographical data rather than normal data. As such, there are a specific set of tools and methodologies that involve such kind of geographical data analysis. A few examples of such Spatial Analysis would be as follows.
- How far is Location A from Location B?
- What are the bus stops within 500 meters from Location X?
- What is the best route from Location X to Location Y?
In order to answer these questions, we need a special set of functions within the databases. These functions are designed to work with geographical data and give us the answers that we are actually looking for. In PostgreSQL, these functions can be made available in the form of an extension known as the PostGIS. This article is just an introduction to the basic data type, POINT, in which the data is stored in the database. Performing spatial analysis with such data is beyond the scope of this article and will be addressed in the future.
Components of the Spatial Data Types in PostgreSQL
In order to work with geographical data, the databases use a special data type. There are multiple data types that deal with geographical data such as points, lines, polygon, etc. In this article, we will specifically look into the POINT data type and understand how to create and work with it. Before moving further, let us also brush our knowledge on how to interpret geographical data in the real world.
If we consider the surface of the earth on a plane, lines of latitude and longitude are used to describe the exact position of a location on that surface.
Figure 1 – Lines of Latitude and Longitude across the surface of the earth – (Source)
As you can see in the figure above, the horizontal and the vertical lines across the surface of the earth can be used to exactly locate places on the surface of the earth. By knowing the latitude and longitude coordinates we can give the exact location of any place in the world. This is known as geolocation. There are various spatial data types that can be used to perform spatial data analysis as follows.
Figure 2 – Geometry Data Types in PostgreSQL
The above figure shows all the available geometrical data types available to work in PostgreSQL. In this article, we will look into the following geographical data types.
- POINT – It is used to define the exact location using latitude and longitude coordinates.
POINT Data Type
A point is an intersection of a latitude and a longitude coordinate. With the combination of a latitude and a longitude coordinate, it is possible to locate any location on the map. In PostgreSQL, a point is a geometry data type that stores the data in the geometry data type. For example, let us consider the following coordinates for London where the latitude is 51.501220 and the longitude is –0.138702.
In order to convert the above into PostgreSQL and represent London, we can write it as
POINT(-0.138702 51.501220)
Notice how the longitude value is placed in the front and the latitude is placed next to it with just a space as a separation between the two. This is known as Well Known Text or WKT in PostGIS. However, this is just textual data and in order to convert it into a geographical object and load it into the database, we are going to use the spatial function ST_GeomFromText(text WKT, integer SRID). In this format, we are going to provide the coordinates as the text and a Spatial Reference ID as the second parameter. The SRID is a spatial reference ID that tells the PostGIS engine in what format is our data, however, there can be many values for SRID which are beyond the scope of this article. For this example, we will consider the SRID as 4326. So, the final representation of the point data type can be written as follows.
ST_GeomFromText(‘POINT(-0.138702 51.501220)’, 4326)
When the above statement is executed in PostgreSQL, the PostGIS engine converts the Well Known Text string value into a valid geometry data type and stores it within the database in a binary format.
Now let us create a table in PostgreSQL and insert geometry data into it. You can use the following SQL query from the gist to create and insert data into your table.
https://gist.github.com/aveek22/daf288a18480ab91be7a17cf33ad7cfc
Once you run the query, you will see that a table has been created in PostgreSQL that contains data for London.
Figure 3 – Storing Geometry data into PostgreSQL
As you can see in the above figure, the column sensorLocation which was provided coordinates with a Well Known Text format has been converted into a geometry data type and the data is stored in a binary format. We can also view the location on the map by clicking the small eye button on the right of the column header.
Figure 4 – Viewing Geometry data on a map in PostgreSQL
As you can see in the figure above, when we select the Geometry Viewer in PGAdmin, the point is being viewed as a blue dot on top of the London city on the map. Since we have only one record in the dataset, only one dot is visible on the map. All such points from the dataset will be displayed on the map that corresponds to its location coordinates.
Alternatively, you can also view the data type of an existing geometry data type by using the following SQL script.
SELECT ST_GeometryType(sensorLocation) FROM SQLShackGeomTest;
Figure 5 – Checking the Geometry Data Type
Conclusion
In this article, we have understood what spatial data is all about and how it can be stored in a PostgreSQL database. We have also learnt the spatial data type POINT associated with PostGIS and its significance. Points are the basic block of working with any spatial data. A combination of points can be used to make other geographical objects such as lines or polygons etc. We will learn more about these in the upcoming articles.
- Getting started with PostgreSQL on Docker - August 12, 2022
- Getting started with Spatial Data in PostgreSQL - January 13, 2022
- An overview of Power BI Incremental Refresh - December 6, 2021