Directory and File Pathnames Reader Parameters
Path Parameters
Path Filter
Specifies the search pattern to use when looking for files and folders. Only file names and folder names that match the Path Filter are output as features.
This parameter is optional. It defaults to the glob pattern *, which reads all contents of the reader dataset folder. If the Path Filter is left empty, the path is not interpreted as a glob pattern, so special glob characters in the path are treated as literals and all files, folders, or both are returned from the reader dataset folder, depending on the setting for Allowed Path Type.
Note: The Path Filter is a glob pattern for matching file or folder names. Some limited name filtering is possible, but full regular expressions are not supported and the syntax is not identical. Recursion is performed using the parameter Recurse Into Subfolders.
The Path Filter allows these selections:
- a specific set of files and/or folders within the reader dataset folder by specifying a relative path; or
- files and/or folders from an entirely different path using an absolute path, in which case the reader dataset will be ignored.
For example, a dataset of c:/temp/ and a Path Filter of *.csv will return all files with a csv extension in the c:/temp folder. Specifying a Path Filter of c:/temp/*.csv produces the same set of features, regardless of the reader dataset. Note: Some expressions may not be interpreted correctly as absolute or UNC paths due to other characters in the pattern.
Patterns are possible in both the reader dataset and in the Path Filter option. For better performance, it is recommended that patterns be used in the Path Filter instead of the reader dataset.
For example, you can set a dataset using a pattern such as c:/temp/**/*.csv without using the Path Filter option. However, you will see much faster results if you do the following instead:
Reader dataset: c:/temp/
Path Filter: *.csv
Note: If you are using a Path Filter that is relative to the reader dataset, the trailing slash between the dataset and the path filter is optional and will be added if needed.
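The relationship between the reader dataset and the Path Filter described above can be sketched in Python. This is only an illustration, not the reader's actual logic: the helper name resolve_path_filter and the rule used to detect absolute and UNC patterns are assumptions made for the example.

```python
import re

# Assumption: a pattern counts as absolute if it starts with a drive letter
# (c:/ or c:\) or with a UNC prefix (// or \\). The reader's real detection
# rules are not documented here and may differ.
_ABSOLUTE = re.compile(r"^(?:[A-Za-z]:[/\\]|//|\\\\)")


def resolve_path_filter(dataset: str, path_filter: str) -> str:
    """Combine a reader dataset folder with a Path Filter (hypothetical helper)."""
    if _ABSOLUTE.match(path_filter):
        # Absolute and UNC filters are used as-is; the dataset is ignored.
        return path_filter
    # Relative filters are appended to the dataset with exactly one slash,
    # whether or not the dataset already ends with one.
    return dataset.rstrip("/") + "/" + path_filter.lstrip("/")


print(resolve_path_filter("c:/temp/", "*.csv"))          # c:/temp/*.csv
print(resolve_path_filter("c:/temp", "*.csv"))           # c:/temp/*.csv
print(resolve_path_filter("d:/other", "c:/temp/*.csv"))  # c:/temp/*.csv
```

All three calls produce the same effective pattern, which mirrors the c:/temp example above.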
Special Characters Supported in the Path Filter
Wildcard | Description | Example | Matches | Does not match |
---|---|---|---|---|
? | Matches any single character case-insensitively. | ?at | Cat, cat, Bat, bat | at |
* | Matches any sequence of zero or more characters case-insensitively. This match can occur in folder names, file names or file extensions, and can be used multiple times. | Law* | Law, Laws, LawS, Lawyer | GrokLaw |
[abc] | Matches a single character case-sensitively. | [CB]at | Cat, Bat | cat or bat |
[a-z] | Matches any single character in the range a-z inclusive and case-sensitively. | [a-z]001 | a001 or b001 etc. | A001, a002 |
[0-9] | Matches any single digit in the range 0-9 inclusive. | Letter[4-5] | Letter4, Letter5 | Letters, Letter, Letter1 |
[a-zA-Z] | Matches any single character in the range a-z or A-Z inclusive and case-sensitively. | test[a-zA-Z] | testA, testa, testZ, or testz | test |
{ab, cd, e} | Matches any of the strings ab or cd or e. | Dir{One,Two} | DirOne, DirTwo | DirThree, DirOneTwo |
\\machine\dir | Matches an absolute UNC network path. | \\comp\temp | | |
Notes:
- The Path Filter is a glob pattern – not a regular expression.
- If the value supplied is a relative path, it is resolved relative to the reader dataset.
- If the value supplied is an absolute or UNC path, it will be treated as such. Note that some expressions may not be interpreted correctly as absolute or UNC paths due to other characters in the pattern.
- Using forward slashes as separators will provide the best results because forward slashes do not conflict with glob escape characters or UNC path names.
See also Common Path Filter Errors.
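To make the wildcard semantics in the table above concrete, the following Python sketch translates a single path component of a glob pattern into a regular expression. It is not the reader's implementation. The function names are hypothetical, only simple sets and ranges are handled inside brackets, and literal characters are assumed to match case-sensitively, which the table does not state.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate one path component of a glob pattern into a compiled regex."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            out.append(".")      # any single character
        elif ch == "*":
            out.append(".*")     # zero or more characters
        elif ch == "[":
            end = pattern.find("]", i + 1)
            if end == -1 or end == i + 1:
                out.append(re.escape(ch))  # unterminated or empty: literal
            else:
                body = pattern[i + 1:end]
                body = body.replace("\\", "\\\\").replace("^", "\\^")
                out.append("[" + body + "]")  # case-sensitive set or range
                i = end
        elif ch == "{":
            end = pattern.find("}", i + 1)
            if end == -1:
                out.append(re.escape(ch))
            else:
                options = pattern[i + 1:end].split(",")
                out.append("(?:" + "|".join(re.escape(o) for o in options) + ")")
                i = end
        else:
            # Assumption: literal characters match case-sensitively.
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out))


def matches(pattern: str, name: str) -> bool:
    """Return True if the whole name matches the glob pattern."""
    return glob_to_regex(pattern).fullmatch(name) is not None


print(matches("?at", "Cat"), matches("?at", "at"))                    # True False
print(matches("Law*", "Lawyer"), matches("Law*", "GrokLaw"))          # True False
print(matches("[CB]at", "Bat"), matches("[CB]at", "cat"))             # True False
print(matches("Dir{One,Two}", "DirTwo"), matches("Dir{One,Two}", "DirOneTwo"))  # True False
```

Because the whole name must match, Law* accepts Lawyer but rejects GrokLaw, as in the table.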
Additional Examples Relative to the Reader Dataset
*.dgn | Matches all files in the reader dataset folder that end with a .dgn extension. |
{data,archive}/*.dgn | Matches all files in either the data or archive subfolders of the reader dataset folder that end with a .dgn extension. |
data/{d,p}*.shp | Matches all files in the data subfolder of the reader dataset folder that begin with a d or a p and that end with a .shp extension. The braces must be contained within a single path component, and cannot contain path separators. |
/data/92?034.dgn | Matches all files in the data subfolder of the reader dataset folder that start with 92, followed by any single character, and end with 034.dgn. |
92[a-z]034.dgn | Matches all files in the reader dataset folder that start with 92, followed by any single lowercase letter, and end with 034.dgn. |
data/**/*.shp | Matches all files with a .shp file extension in the data subfolder of the reader dataset folder and, recursively, in its subfolders. |
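The brace examples above can be read as shorthand for several simpler patterns. The Python sketch below expands each brace group into one pattern per alternative and rejects groups that contain a path separator, following the single-path-component rule stated above. The function expand_braces is a hypothetical helper for illustration; whether the reader expands braces internally in this way is not documented here.

```python
import re

# Matches the first innermost {..} group; nested braces are not handled.
_BRACES = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern: str) -> list[str]:
    """Expand {a,b} groups into one pattern per alternative."""
    match = _BRACES.search(pattern)
    if match is None:
        return [pattern]
    body = match.group(1)
    if "/" in body or "\\" in body:
        # Braces must stay within a single path component.
        raise ValueError("brace group must not contain path separators")
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in body.split(","):
        # Recurse so that any later brace groups are expanded as well.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


print(expand_braces("{data,archive}/*.dgn"))  # ['data/*.dgn', 'archive/*.dgn']
print(expand_braces("data/{d,p}*.shp"))       # ['data/d*.shp', 'data/p*.shp']
```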
Additional Absolute Path Examples
C:/data/*.dgn | Matches all files in the C:/data folder that end with a .dgn extension. |
C:/{data,archive}/*.dgn | Matches all files in the C:/data and C:/archive folders that end with a .dgn extension. |
How to access the Path filter after initial workspace generation.
Common Path Filter Errors
Error | Reason |
---|---|
C:\data\*.dgn | Single backslashes are interpreted as escape characters in the pattern. You can use forward slashes for path separators (C:/data/*.dgn), or escape the backslashes so they are treated as literals (C:\\data\\*.dgn). |
\\myfolder\*.csv | If myfolder is a folder rather than a host, this will fail. Instead, use forward slashes for UNC paths and omit the leading separator for relative paths. For example: //myhost/*.csv (UNC path) or myfolder/*.csv (relative path). |
C[:]/*.txt | Glob pattern syntax can sometimes conflict with path interpretation. In this case, enclosing the colon in square brackets prevents the drive letter from being recognized as the start of an absolute path. You can try different combinations of the dataset and filter, but these may or may not succeed, depending on the conflict. |
C:[data] | Glob pattern syntax can sometimes conflict with special characters in the path. In this case, the folder name contains square brackets [], which by default are misinterpreted as a glob pattern. To ensure the path is read correctly, remove the default glob pattern asterisk * and leave the Path Filter empty. This disables glob interpretation, so the path is interpreted literally. |
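When a Path Filter is built from a Windows path in a script, the two fixes for the backslash error above amount to simple string replacements. The following Python sketch shows both; the function names are hypothetical and are not part of any FME API.

```python
def to_forward_slashes(path_filter: str) -> str:
    """Use forward slashes as separators so none are read as glob escapes."""
    return path_filter.replace("\\", "/")


def escape_backslashes(path_filter: str) -> str:
    """Double every backslash so each one is treated as a literal."""
    return path_filter.replace("\\", "\\\\")


raw = r"C:\data\*.dgn"
print(to_forward_slashes(raw))  # C:/data/*.dgn
print(escape_backslashes(raw))  # C:\\data\\*.dgn
```

Neither helper should be applied to a pattern that already uses backslashes deliberately as escape characters, since both would change its meaning.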
Recurse Into Subfolders
Specifies whether to match the current folder and, recursively, all subfolders.
If set to Yes, the path filter glob pattern will be prepended with **/, a wildcard that indicates recursion into subfolders.
Note: We recommend that users do not use the recursion wildcard directly in the Path Filter, but instead set this option to Yes if matching all subfolders is desired.
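The effect of prepending **/ can be illustrated with Python's pathlib, used here only as a stand-in for the reader's own glob matching, which may differ in detail. The helpers effective_pattern and list_matches are hypothetical, and the sketch covers only Path Filters that are relative to the dataset folder.

```python
import tempfile
from pathlib import Path


def effective_pattern(path_filter: str, recurse: bool) -> str:
    """Prepend the recursion wildcard when recursion is enabled."""
    return "**/" + path_filter if recurse else path_filter


def list_matches(root: Path, path_filter: str, recurse: bool) -> list[str]:
    """Return matching paths relative to root, using forward slashes."""
    pattern = effective_pattern(path_filter, recurse)
    return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub" / "deep").mkdir(parents=True)
    for name in ("a.csv", "sub/b.csv", "sub/deep/c.csv", "sub/d.txt"):
        (root / name).touch()
    print(list_matches(root, "*.csv", recurse=False))
    # ['a.csv']
    print(list_matches(root, "*.csv", recurse=True))
    # ['a.csv', 'sub/b.csv', 'sub/deep/c.csv']
```

With recursion enabled, the file in the top-level folder is still matched, along with the files in every subfolder.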
Allowed Path Type
Specifies whether to search for any path type, which includes both files and folders, or specifically files or folders only.
Specifies whether to populate file properties for each matched file or folder returned.
If set to Yes, the attributes in the table below are added to the schema and set on each output feature for a file or folder, recording the corresponding timestamps, file size, owner name, and read-only status.
See the Feature Representation section for information about each specific attribute.
If set to No, the attributes are neither added to the schema nor set on output features. Setting this parameter to No can sometimes improve performance.
File Properties Attributes |
---|
path_modified_date |
path_accessed_date |
path_created_date |
path_filesize |
path_ownername |
path_readonly |
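For readers who want comparable information outside the reader, the Python sketch below gathers similar properties with the standard library and stores them under the attribute names listed above. It is an approximation only: the value formats the reader actually writes are described in the Feature Representation section and are not reproduced here, and the helper file_properties is hypothetical.

```python
import os
import tempfile
from datetime import datetime
from pathlib import Path


def file_properties(path: Path) -> dict:
    """Collect file properties keyed by the reader's attribute names."""
    st = path.stat()
    try:
        owner = path.owner()  # not available on every platform
    except (NotImplementedError, KeyError, OSError):
        owner = None
    return {
        "path_modified_date": datetime.fromtimestamp(st.st_mtime),
        "path_accessed_date": datetime.fromtimestamp(st.st_atime),
        # Caveat: st_ctime is the creation time on Windows but the time of
        # the last metadata change on most Unix systems.
        "path_created_date": datetime.fromtimestamp(st.st_ctime),
        "path_filesize": st.st_size,
        "path_ownername": owner,
        # Approximation: read-only means the current user cannot write.
        "path_readonly": not os.access(path, os.W_OK),
    }


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes(b"hello")
    for key, value in file_properties(sample).items():
        print(key, value)
```

Each of these calls touches the file system once per file, which gives a rough sense of why skipping the properties can improve performance on large folders.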