Where’s my data?
When designing reports, one of the trickiest aspects is often just finding the data. There may be no database schema or the original designer is unavailable, often unhelpful.
I’ve used the free version of Redgate’s SQL Search for some time. It’s great for finding column or table names. Especially useful if you want to find all tables that have a relationship to another table’s primary key.
However, it doesn’t allow you to search the actual data in the database, so I decided to work on something to help. I created a table in my database to contain every distinct text value in my entire database, plus the columns and tables it features in.
First a warning: do this on a test database. It’s very resource intensive and I don’t want to be responsible for breaking your main application database! It’s also specific to Microsoft SQL Server. Should be easy enough to modify for anythig else though
The data I’m looking for is normally textual, so I just wanted to be able to find all varchar fields. First, I created a table to hold my search information:
CREATE TABLE [dbo].[aaValues](
[columnValue] [nvarchar](255) NULL,
[columnName] [nvarchar](50) NULL,
[tableName] [nvarchar](50) NULL
) ON [PRIMARY]
Note the column size limits, change if needed. It needs a non-clustered index on columnValue, but adding it later improves the initial insert performance. If you’re feeling really pedantic, add a primary key across all 3 columns.
Next, create a query to generate the INSERTs:
select 'INSERT INTO aaValues SELECT DISTINCT [' + COLUMN_NAME + '], '''
+COLUMN_NAME+ ''',' +''''+ TABLE_NAME+'''' +' FROM [' + TABLE_NAME +']'
from INFORMATION_SCHEMA.COLUMNS cols where (cols.DATA_TYPE = 'nvarchar' or cols.DATA_TYPE = 'varchar') and TABLE_NAME <> 'aaValues'
Looks complicated, but all it’s doing is creating a series of INSERT statements. This is one row of 64 for my little sample database:
INSERT INTO aaValues SELECT DISTINCT [Customer Name], 'Customer Name' , 'Customer' FROM [Customer]
Now, you just need to copy the results (in my case 64 lines) into a new query and run it. You may need to make some changes to the query if you’re not using the default schema (dbo).
Now, add an index to speed up searching:
CREATE NONCLUSTERED INDEX [ValueIndex] ON [dbo].[aaValues]
( [columnValue] ASC)
Finding your data is now easy. In my ‘real world’ use, I wanted to find every table and column that contained a reference to a database called ‘SERVER\SQL’. It was just case of using
SELECT [columnName]
,[tableName]
FROM [aaValues] where columnValue = 'SERVER\SQL'
to find all the tables I needed to examine further.
If you’re having problems copying and pasting the code, chances are that the quotation marks are being ‘prettified’. They should all be just singular quotes (apostrophes).