SQL Everything

Access

SQL Everything

General

Microsoft provides a set of database access components (see Wikipedia), of which OLE DB and its  JET installable ISAM drivers can be used to access arbitrary data sources directly from within SQL.

SELECT

The basic syntax for an SQL SELECT statement is

SELECT *
  FROM tblTable

You might already know the option to select data from a table in a different Microsoft Acess database

SELECT *
  FROM tblTable
    IN 'c:\data.accdb'

or without the IN keyword by using square brackets instead of single quotes and a dot (.) as separator.

SELECT *
  FROM [c:\data.accdb].tblTable

Which of these syntax variants you use, depends on your personal preferences and the use case that you are dealing with. The path can be a local path, a UNC path, HTTP or FTP path; this also holds true for all the following other sources.

Select from other sources

Let's get to the interesting part and select data from other sources than Microsoft Access databases. Generally speaking every source, that could be interpreted as table-like data, is a possible candidate for linking the data via OLE DB into Access.

Microsoft Excel

You can select the content of a worksheet with

SELECT *
  FROM [Sheet1$]
    IN 'c:\data.xlsx'[Excel 12.0 XML;]

Note, that the name of the worksheet ends with a dollar sign, the path of the workbook is written in single quotes and that it is followed by a file format specifier and the extended properties / options in square brackets.

File format

There are two file formats of workbooks:

Type Extension Official type
Binary XLS Excel 8.0
XML XLSX Excel 12.0 XML

The binary format has been used as default format until Microsoft Excel version 2003, the XML format has been specified as Office Open XML by Microsoft (not to mix up with OpenOffice and LibreOffice, which use the Open Document Format).

Ranges

In Microsoft Excel you want to get the cell contents of the ranges that you specify. There are three types of ranges:

Type Example
Named range [Name]
Worksheet [Sheet$]
Worksheet with range [Sheet$A1:B5]
Structured reference Table[Column]

Please note, that the delimiter between a worksheet name and a cell range is here a dollar sign and not an exclamation mark. Currently the selection of ranges by structured references is not possible, which will hopefully change in the future.

Options

You can specify extended properties for the data, in order to control the way that it is linked:

Option Description
HDR Header; HDR=0 (default) means no header, HDR=1 means that the first record is considered as header for the data.
IMEX IMEX is the abbreviation for "intermixed"; the driver handles data always as intermixed, i. e. it is linked as string, when IMEX=1 is set. Otherwise the data will be analyzed: the driver reads the first rows of a column, decides what data type the column would contain and tries to cast the rest of the data accordingly.

Text files

Data is very often transferred in text file formats: comma separated values (CSV), tab separated values (TSV) and other delimited files. The basic syntax for linking text file data is as follows:

SELECT *
  FROM data.csv 
    IN 'c:\'[TEXT;];

Note, that the text file serves as "table" and the folder it is contained in servers as "database". The extended properties start with the keyword TEXT;.

Options

The extended properties for text files are:

Option Description
FMT Format; the format of a text file can either be Delimited or Fixed. It is possible to specify the delimiter in the registry, but I strongly advise against doing so, you should rather specify the properties of the text file in a schema.ini file (see next section).
IMEX. IMEX works as described above.

Schema

You will be disappointed in most cases, that you link text files into your database, as there are many variants of text files and most data suppliers like to apply their own settings. The perfect and at the same time powerful solution is the specification of a schema.ini file in the directory of the text file. Please read the documentation and try importing different types of text files in order to explore the options and their results.

HTML

It is also possible to link HTML tables and lists. This comes in handy when you need to use online published data for your database. The HTML feature is limited in a few ways, but is still worth exploring. The basic syntax looks as follows:

SELECT * 
  FROM [Table] 
    IN 'c:\data.html'[HTML Import;HDR=No;]

Types

It is noteworthy that you can link HTML tables as well as lists (ol, ul), which results in four types, which are retrievable this way:

HTML element Description Example
<table> Table without caption [Table]
<table> Table with caption [The caption]
<ol> Ordered list code>[@Table]</code
<ul> Unordered list code>[@Table]</code

The tables and lists are numbered consecutively throughout the HTML document, starting with "no number": Table, Table1, Table2 and so on. The table reference has to be put in square brackets, the lists have in addition to be prefixed with an at sign (code>@</code).

Options

Option Description
HDR Header; HRD=0 (default) means no header, HDR=1 means that the first record is considered as header for the data.
IMEX IMEX is the abbreviation for "intermixed"; the driver handles data always as intermixed, i. e. it is linked as string, when IMEX=1 is set. Otherwise the data will be analyzed: the driver reads the first rows of a column, decides what data type the column would contain and tries to cast the rest of the data accordingly.

Microsoft Exchange / Microsoft Outlook

It is also possible to link Microsoft Excchange and Microsoft Outlook content into a database. This includes not only the basic folders like Inbox as well as all the folders you created in your Microsoft Outlook folder hierarchy, it encompasses personal files (PST), offline files (OST) and the personal address book (PAB) as well as your calendar and task list. The SQL for listing the content of the inbox of your Account would look as follows:

SELECT * 
  FROM [Object] 
    IN 'c:\temp\'[Exchange 4.0;MAPILEVEL=first.last@domain.tld|;TABLETYPE=0;]

Linking Microsoft Exchange / Microsoft Outlook content is unfortunately not as straight-forward as the aforementioned data types. Note that 'c:\temp' is the location where the linking will happen; any arbitrary existing path with user access will do.

The source type is Exchange 4.0, followed by  You have to figure out your account name from within Microsoft Outlook: sometimes it is just your e-mail address, sometimes it is an individual or technical name.

The pipe (|) after the account name is important, you will add the folder hierarchy of nested folders after the pipe. In case you should have more than one profile associated with your account, you can provide in the extended properties a parameter PROFILE.

Object Description
Inbox E-mail inbox
Calendar Calendar
Contacts? Contacts
Tasks? Task list

Options

Option Description
MAPILEVEL Name of the account. In case you are retrieving data from a lower level of your folder hierarchy, you will specify the folders in descending order separated by pipes. You can see this as the path specification to the destination folder with pipes as path separators.
PROFILE Mail profile; if you should have created more than one mail profile, you can specify the name of the profile that should be used here.
TABLETYPE The parameter TABLETYPE=0 (default) is used for folders, TABLETYPE=1 is used for address books.

Open Database Connection (ODBC)

Remote databases can be easily accessed, no matter whether they have a DSN or not. The latter is extremely usefull, whenever you need to connect to different databases programatically. The SQL statement for a selection from a database with a DSN looks as follows:

SELECT *
  FROM Database.Table[ODBC;DSN=MyDB;UID=user;PWD=password;]

Leightweight Directory Access Protocol (LDAP)

It is also possible to query a LDAP server for user information. This is not a OLEDB query like the other SQL statements. The SQL syntax would look like this:

SELECT Field1, Field2
  FROM 'LDAP://DC=ldap,DC=domain,DC=tld'
 WHERE objectCategory = 'person'
   AND objectClass    = 'user'
   AND Field1 = 'Criteria1'

Examples for valid fields are

Attribute Name
c Country code
cn Common name
co Country
company Company
department Department
displayName Display name
givenName First name
homeDirectory Home directory
homeDrive Home drive
info Info
l Location / city
mail E-mail
manager Manager
memberOf Membership
mobile Mobile
personalTitle Salutation
postalCode Postal code / zip
sn Last name
streetAddress Street
telephoneNumber Phone
title Title

Which attributes are actually used on the LDAP server needs to be checked prior to running any queries. Please also keep in mind that the SQL statement will be the slower the bigger the directory on the server is.

INSERT, UPDATE, DELETE

The analogous syntax applies to INSERT, UPDATE and DELETE statements.

It is, for example, possible to create a Microsoft Excel workbook without running any kind of VBA code by just executing an INSERT INTO statement into a specified workbook.

Depending on the type of source, you might also not be able to update or delete data.

Advantages

If you are familar with coding Microsoft Access databases with VBA you probably often ran into situations, where you had to retrieve data from other sources in order to work with it in your database. By creating pure SQL statements you spare out the part to import data and can link the desired data dynamically instead.

As you can assemble the SQL strings programmatically, you are able to realize batch processing without creating and deleting the corresponding tables and / or queries. For example, if you want to make a database, that tracks mails to certain topics for a multitude of users, you can easily link the inbox of the current user into your access database. The only alternative to this approach would be to manually create linked tables.

What is not working

There is a huge amount of connection strings, that you can use in programmatic database connections (DAO, ADO, see connectionstrings.com). It would be awesome, if every single of those connections could also be realized with Jet OLEDB - unfortunately it is not possible.

A good (actually sad) example is the connection to a SharePoint server. Basically most of the structures on a SharePoint server are lists, which is literally crying for being linked or imported into Microsoft Access and vice versa.

We can see from different connection string examples that the extended properties should look like WSS;HDR=NO;IMEX=2;DATABASE=https://sp.domain.tld/site;LIST={ABC};VIEW={XYZ};RetrieveIds=Yes. Unfortunately it is not possible (at least I did not find a way yet) to link the resources directly.

You can even see in the MSysImEx system table, that the parameters should work, but Jet OLEDB does not link the data and instead delivers error messages like "view not found".

Get started

It is a wide field to explore. I suggest, that you start with the basic types and try to link Microsoft Excel ranges and text files into your database and that you try to create workbooks with INSERT INTO statements.

If you found a way to link a resource into your database, check out the parameters in the hidden system table MSysImEx, write down the strings and parameters and try to create a query that links the data through a SELECT statement.

Reading

There are not many good resources on the net on this topic, but you might want to have a look at the following links:

  • http://ewbi.blogs.com/develops/2006/12/reading_html_ta.html
  • https://msdn.microsoft.com/en-us/library/aa201324(office.11).aspx
  • https://msdn.microsoft.com/en-us/library/aa140022(office.10).aspx
  • schema.ini
  • connectionstrings.com

Finally

If you should have found a resource that you were able to successfully link into your Microsoft Access database by means of a simple SELECT statement, please let me know. I will happily update this blog post.

Previous Post