We’ve recently had production failures because our developers changed an important reference. In this case, we had a view which several procedures and views referenced. A developer made a change to the referenced view by removing columns, which caused several procedures and a view that referenced it to fail. We’re considering whether we should stop this practice, or if there are other ways we can prevent changes to an object that’s being referenced by other objects (in our case, a view).
Overview
In some cases, a view or procedure – which other objects reference – can be a convenient development style as it allows for re-use of code over redevelopment. We’ll look at a contrived example of using a referenced view, by creating one with a view and procedure that both reference it to simulate an example of how a change to it may affect references.
Creating a referenced object
Our first step will involve creating a referenced object, which in this case will be a view. In the below code we create the view, vw_ReferencedView.
1 2 3 4 5 6 7 8 |
CREATE VIEW vw_ReferencedView AS SELECT tt.Id LoadId , tt.LdDate LoadDate , t.TformDate TransformDate FROM tbTform t INNER JOIN tbInitLdr tt ON t.LdId = tt.Id |
Next, we will create two references: a view that references our referenced view vw_ReferencesOtherView and a procedure stp_ReferencesView. In both of these objects, we refer to a column that’s returned from the referenced view, vw_ReferencedView.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
CREATE VIEW vw_ReferencesOtherView AS SELECT vr.* , p.PartitionDate FROM vw_ReferencedView vr INNER JOIN tbPartition p ON vr.LoadId = p.LdId CREATE PROCEDURE stp_ReferencesView @date DATETIME AS BEGIN SELECT r.* , p.PartitionDate FROM vw_ReferencedView r INNER JOIN tbPartition p ON r.LoadId = p.LdId WHERE r.LoadDate > @date END |
What if we removed a column from our referenced view? It’s possible that it may not matter with selecting all columns from the view reference, but it may impact the columns that we use. What if we changed the name of a column? We’ll simulate this by changing LoadId to LdId in the vw_ReferencedView. We can see in the below examples when we try to execute the procedure or select from the second view that both fail:
1 2 3 4 5 6 7 8 9 10 11 |
ALTER VIEW vw_ReferencedView AS SELECT tt.Id LdId , tt.LdDate LoadDate , t.TformDate TransformDate FROM tbTform t INNER JOIN tbInitLdr tt ON t.LdId = tt.Id ---- Error 1: SELECT * FROM vw_ReferencesOtherView |
1 2 |
---- Error 2: EXEC stp_ReferencesView '2017-01-01' |
While we use names for clarity in our example, we should also be careful about names. Some testing software that creates a new test database for the purpose of running unit tests may build objects by alphabetic order. This means that a referenced object with a name starting with “v” will come after an object with a starting letter of “a”, and that will cause issues if the object starting with “a” references the object starting with “v.” If you’re using testing software that does this, any referenced object will need to be named appropriately.
Development with referenced objects
As we can observe, creating a referenced object, such as our example of a view being referenced by other objects, does allow for code re-use and can be a convenient way of development since we don’t have to add an entire block of code to new objects. But we should consider a few scenarios where this is not appropriate.
We want to be sure that our underlying reference, such as a view, in this case, isn’t a query we’ll be changing frequently (if at all). If I’m creating a view that selects some columns from a table along with another joined table, this is seldom an appropriate use case, as the list of columns can change. In addition, unless there’s a performance improvement of an indexed view, there’s little benefit to this design from a code perspective. By contrast, if I’m using a view to order or partition data by a primary key with a result set intended for re-use, this may be more appropriate. In the same manner, with some remote queries from one server to another, I may use a view on one server because the second server’s query will determine what’s needed in that view – thus a change in the query needs will directly affect the view on the other server.
These design considerations demarcate a referenced object that either seldom changes or must change relative to its reference (one reference versus many) against a design where we’re simply re-using a view for convenience without thought to how often we may need to change the referenced view
In the case of multiple object references, as long as we don’t change the referenced view’s definition (vw_ReferencedView in our contrived example), changes to the objects referencing it won’t affect the referenced view. By adding notes about which objects reference it, we can review these objects if we ever need to make changes to our referenced view. These comments can easily be added to the referenced view when we create a new object to reference it. To see an example with this from our above code, in vw_ReferencedView, we would add a comment each time we added another reference:
1 2 3 4 5 6 7 8 9 10 |
ALTER VIEW vw_ReferencedView AS ---- References view: vw_ReferencesOtherView ---- References view: stp_ReferencesView SELECT tt.Id LoadId , tt.LdDate LoadDate , t.TformDate TransformDate FROM tbTform t INNER JOIN tbInitLdr tt ON t.LdId = tt.Id |
The cost to this approach is training new team members to add a comment with the new reference when they create one along with checking if a view has references.
We can also use object metadata from syscomments, which allows us to search for names of objects with objects. In the below code, we look for objects where the text of the object, such as stored procedure or view text, have the words vw_referencedview in it.
1 2 3 4 5 |
SELECT OBJECT_NAME(id) ObjectName , [text] ObjectText FROM sys.syscomments WHERE LOWER([text]) LIKE '%vw_referencedview%' |
While this can provide a helpful shortcut, remember that this will still include objects that may have the name in a comment as a part of the object, not a query that references it. This will not assist if other databases reference our view – such as three remote database servers with a database that references our view. We would have to check every database on every server, which may be expensive relative to how many servers and databases we manage.
In the syscomments query, I use the LOWER function with this query while keeping the object name in lower case. Per Microsoft, the column text is a nvarchar 4000, which means that a procedure or view that exceeds this length will get another entry. If you see duplicate objects, this may be why. In the below image, we see several entries for one procedure because the procedure text of this object exceeds the 4000 nvarchar limit.
Unfortunately, as of the latest update from Microsoft, this is not available in AzureSQL (see below references).
Finally consider that with some changes, it’s less convenient to use a referenced view. If I have 3 procedures and 2 views that reference a view, and I need a column change for one procedure, I change both the procedure and the referenced view, provided it doesn’t impact the other references. If the referenced view is only one query, it would have been faster to have the query directly in the objects over the reference as I would only change one object. Assuming that the performance impact doesn’t change with this design, this scenario favors each object having the query directly.
Conclusion
Designing objects with re-usable queries that other objects can reference may save us significant development time. As long as we consider whether we’ll need to re-develop the referenced object or how we’ll track its references, from comments to metadata, when we do make changes, the references may introduce few problems. If there are situations where we may not need a reference due to a query being temporary or being easy to develop, we may want to consider it within an object over using a reference.
- Data Masking or Altering Behavioral Information - June 26, 2020
- Security Testing with extreme data volume ranges - June 19, 2020
- SQL Server performance tuning – RESOURCE_SEMAPHORE waits - June 16, 2020