Directory and File Pathnames Reader Parameters
Path Parameters
Specifies the search pattern to use when looking for files and folders. Only filenames and folder names that match the Path Filter will be output as features.
This parameter is optional and defaults to a glob pattern of *, which reads all contents of the current directory. If the Path Filter is left empty, the path is not interpreted as a glob pattern. Special glob characters in the path are then treated as literals, and all files, folders, or both are returned from the reader dataset folder, depending on the setting for Allowed Path Type.
Note: The Path Filter is a glob pattern for matching file or folder names. Recursion and some limited name filtering are possible, but full regular expressions are not supported, and glob syntax is not identical to regular expression syntax.
The Path Filter allows these selections:
- a specific set of files and/or folders within the reader dataset folder by specifying a relative path; or
- files and/or folders from an entirely different path using an absolute path, in which case the reader dataset will be ignored.
For example, a dataset of c:/temp/ and a Path Filter of *.csv will return all files with a csv extension in the c:/temp folder. Specifying a Path Filter of c:/temp/*.csv produces the same set of features, regardless of the reader dataset. Note: Some expressions may not be interpreted correctly as absolute or UNC paths due to other characters in the pattern.
Patterns are possible in both the reader dataset and in the Path Filter option. For better performance, it is recommended that the glob pattern be used in the Path Filter instead of the reader dataset.
For example, instead of using a dataset of c:/temp/**/*.csv and not using the Path Filter option, use a dataset of c:/temp/ and a Path Filter of **/*.csv for much faster results. Note that if the Path Filter is relative to the reader dataset, the slash between the dataset and the Path Filter is optional and will be added if needed.
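The rule for combining the reader dataset with the Path Filter can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the reader's implementation: `effective_pattern` is a hypothetical helper, and it assumes both values use forward slashes. As noted above, the reader itself may fail to recognize some absolute or UNC paths when other glob characters are present; this sketch does not model that.

```python
from pathlib import PureWindowsPath


def effective_pattern(dataset: str, path_filter: str) -> str:
    """Combine a reader dataset folder and a Path Filter into one search pattern.

    Hypothetical helper that mirrors the documented rule, not the reader's code.
    """
    # Drive-letter paths (c:/...) and UNC paths (//host/share/...) are absolute,
    # so the reader dataset is ignored.
    if PureWindowsPath(path_filter).is_absolute():
        return path_filter
    # Relative filter: join with exactly one slash, whether or not the
    # dataset already ends with one.
    return dataset.rstrip("/") + "/" + path_filter


print(effective_pattern("c:/temp", "**/*.csv"))        # c:/temp/**/*.csv
print(effective_pattern("d:/other/", "c:/temp/*.csv"))  # c:/temp/*.csv
```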
Special Characters Supported in the Path Filter
Wildcard | Description | Example | Matches | Does not match |
---|---|---|---|---|
? | Matches any single character case-insensitively. | ?at | Cat, cat, Bat, bat | at |
* | Matches any sequence of zero or more characters case-insensitively. This match can occur in folder names, file names or file extensions, and can be used multiple times. | Law* | Law, Laws, LawS, Lawyer | GrokLaw |
** | Matches the current folder and recursively all subfolders. For performance reasons, it is recommended that you use this option in the Path Filter instead of the reader dataset. | **/Law/* | Temp\Law\a, Temp\Law\b, Temp2\Law\A, Temp2\Law\B | Temp\Junk\a |
[abc] | Matches any single character in the set, case-sensitively. | [CB]at | Cat, Bat | cat or bat |
[a-z] | Matches any single character in the range a-z inclusive, case-sensitively. | [a-z]001 | a001 or b001 etc. | A001, a002 |
[0-9] | Matches any single digit in the range 0-9 inclusive. | Letter[4-5] | Letter4, Letter5 | Letters, Letter, Letter1 |
[a-zA-Z] | Matches any single character in the range a-z or A-Z inclusive, case-sensitively. | test[a-zA-Z] | testA, testa, testZ, or testz | test |
{ab, cd, e} | Matches any of the strings ab or cd or e. | Dir{One,Two} | DirOne, DirTwo | DirThree, DirOneTwo |
\\machine\dir | Matches an absolute UNC network path. | \\comp\temp | | |
Notes:
- The Path Filter is a glob pattern and not a regular expression.
- If the value supplied is a relative path, then it will be interpreted relative to the reader dataset.
- If it is an absolute or UNC path, then it will be treated as such. Note that some expressions may not be interpreted correctly as absolute or UNC paths due to other characters in the pattern.
- Using forward slashes as separators will provide the best results because forward slashes do not conflict with either glob escape characters or UNC path names.
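To experiment with the bracket and brace wildcards outside the reader, the following Python sketch approximates them with the standard `fnmatch` module. It is not the reader's matcher, and several points are assumptions: `fnmatchcase` treats literal characters case-sensitively (the table above only states case behavior for the wildcards themselves), and `fnmatch` has no brace support, so the hypothetical `expand_braces` helper rewrites `{a,b}` groups into separate patterns first.

```python
import re
from fnmatch import fnmatchcase


def expand_braces(pattern):
    """Expand {a,b,...} groups into a list of plain patterns (hypothetical helper)."""
    group = re.search(r"\{([^{}]*)\}", pattern)
    if group is None:
        return [pattern]
    head, tail = pattern[:group.start()], pattern[group.end():]
    expanded = []
    for option in group.group(1).split(","):
        # Tolerate spaces after commas, as in {ab, cd, e}.
        expanded.extend(expand_braces(head + option.strip() + tail))
    return expanded


def matches(name, pattern):
    """Return True if name matches any brace-expanded form of pattern."""
    return any(fnmatchcase(name, plain) for plain in expand_braces(pattern))


print(expand_braces("Dir{One,Two}"))     # ['DirOne', 'DirTwo']
print(matches("DirOneTwo", "Dir{One,Two}"))  # False
print(matches("Cat", "[CB]at"), matches("cat", "[CB]at"))  # True False
```

The brace group is expanded before matching, so each alternative is tested as a complete name; that is why DirOneTwo is rejected.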
Additional Examples Relative to the Reader Dataset
*.dgn | Matches all files in the reader dataset folder that end with a .dgn extension. |
**/*.dgn | Matches all files in the reader dataset folder and any subfolder below it that end with a .dgn extension. |
{data,archive}/*.dgn | Matches all files in the reader dataset folder in either the data or archive subfolders that end with a .dgn extension. |
/data/92?034.dgn | Matches all files in the data subfolder of the reader dataset folder that start with 92, followed by any single character, and end with 034.dgn. |
92[a-z]034.dgn | Matches all files in the reader dataset folder that start with 92, then any single lowercase letter, and end with 034.dgn. |
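A quick way to get a feel for relative patterns like these is to try them against a scratch folder. The sketch below uses Python's `pathlib` globbing, which is a separate implementation and may differ from the reader in case handling and in other details (for instance, it has no `{a,b}` support). The file names and the `find` helper are made up for the illustration.

```python
import tempfile
from pathlib import Path


def find(dataset, path_filter):
    """Return matches as sorted forward-slash paths relative to the dataset folder."""
    return sorted(p.relative_to(dataset).as_posix() for p in dataset.glob(path_filter))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)  # stands in for the reader dataset folder
    for rel in ["92a034.dgn", "roads.dgn", "data/rivers.dgn",
                "data/old/lakes.dgn", "notes.txt"]:
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()

    top_level = find(root, "*.dgn")        # dataset folder only
    recursive = find(root, "**/*.dgn")     # dataset folder and all subfolders
    single_char = find(root, "92?034.dgn")
    lowercase = find(root, "92[a-z]034.dgn")

print(top_level)   # ['92a034.dgn', 'roads.dgn']
print(recursive)   # ['92a034.dgn', 'data/old/lakes.dgn', 'data/rivers.dgn', 'roads.dgn']
```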
Additional Absolute Path Examples
C:/data/*.dgn | Matches all files in the c:/data folder that end with a .dgn extension. |
C:/data/**/*.dgn | Matches all files in the c:/data folder and any subfolder that end with a .dgn extension. |
C:/**/*.dgn | Matches all files on the entire C: drive that end with a .dgn extension. |
C:/{data,archive}/*.dgn | Matches all files in the c:/data and c:/archive folders that end with a .dgn extension. |
How to access the Path filter after initial workspace generation.
C:\data\*.dgn | Single backslashes are interpreted as escape characters in the pattern and must be escaped themselves to be treated as literals. Using forward slashes avoids the problem. For example: C:/data/*.dgn |
\\myfolder\*.csv | If myfolder is a folder instead of a host, this will fail. Instead, use forward slashes for UNC paths and omit the leading separator for relative paths. For example: //myhost/*.csv or myfolder/*.csv |
C[:]/*.txt | Glob pattern syntax can sometimes conflict with the path interpretation. In this case, placing the colon inside square brackets prevents the drive letter from being recognized as the start of an absolute path. You can try different combinations of the dataset and filter, but these may or may not succeed, depending on the conflict. |
C:[data] | Glob pattern syntax can sometimes conflict with special characters in the path. In this case, the directory name contains square brackets [] which, by default, will be misinterpreted as a glob pattern. To ensure it is read correctly, remove the default glob pattern asterisk * and leave the Path Filter empty. This will disable glob interpretation and the path will be interpreted literally. |
C:\\**\\*.dgn | ** glob expansions only work with forward slashes. To avoid this issue, use forward slashes for path separators. For example: C:/**/*.dgn |
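Several of the problems above disappear once backslashes are replaced with forward slashes before the value is entered as a Path Filter. The following Python sketch shows one way to do that conversion. `to_forward_slashes` is a hypothetical helper, and it assumes every backslash in the input is meant as a path separator: it cannot tell a separator from a deliberate glob escape, and it cannot tell whether a leading double backslash names a host or a folder.

```python
def to_forward_slashes(path_filter):
    """Rewrite backslash separators as forward slashes (hypothetical helper)."""
    prefix, rest = "", path_filter
    if rest.startswith("\\\\"):
        # A leading double backslash marks a UNC path: keep both separators.
        prefix, rest = "//", rest[2:]
    # Collapse escaped backslashes first, then convert remaining single ones.
    return prefix + rest.replace("\\\\", "/").replace("\\", "/")


print(to_forward_slashes(r"C:\data\*.dgn"))     # C:/data/*.dgn
print(to_forward_slashes(r"C:\\**\\*.dgn"))     # C:/**/*.dgn
print(to_forward_slashes(r"\\myhost\*.csv"))    # //myhost/*.csv
```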
Specifies whether to search for any path type, which includes both files and folders, or specifically files or folders only.
Specifies whether to populate file properties for each matched file or folder returned.
If set to Yes, the attributes in the table below will be set on output features for files and folders with the corresponding timestamps, file size, owner name, and read-only attributes.
See the Feature Representation section for information about each specific attribute.
If set to No, the attributes will still exist but with their values set to the empty string “”. Setting this parameter to No can sometimes improve performance.
File Properties Attributes |
---|
path_modified_date |
path_accessed_date |
path_created_date |
path_filesize |
path_ownername |
path_readonly |
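For readers who want to see what kind of information these attributes carry, the Python sketch below gathers comparable values with `os.stat`. Only the attribute names come from the table above. The `file_properties` function, its `populate` flag, the ISO date strings, and the boolean read-only value are all assumptions made for the illustration, and the reader's actual value formats are described in the Feature Representation section. Note also that `st_ctime` is the creation time on Windows but the metadata change time on most Unix systems.

```python
import os
import stat
import tempfile
from datetime import datetime

PROPERTY_NAMES = ["path_modified_date", "path_accessed_date", "path_created_date",
                  "path_filesize", "path_ownername", "path_readonly"]


def file_properties(path, populate=True):
    """Return a dict keyed by the attribute names above (illustrative only)."""
    if not populate:
        # Attributes still exist, but every value is the empty string.
        return {name: "" for name in PROPERTY_NAMES}
    info = os.stat(path)
    try:
        import pwd  # Unix only
        owner = pwd.getpwuid(info.st_uid).pw_name
    except (ImportError, KeyError):
        owner = ""  # owner lookup unavailable on this platform or for this user id
    return {
        "path_modified_date": datetime.fromtimestamp(info.st_mtime).isoformat(),
        "path_accessed_date": datetime.fromtimestamp(info.st_atime).isoformat(),
        "path_created_date": datetime.fromtimestamp(info.st_ctime).isoformat(),
        "path_filesize": info.st_size,
        "path_ownername": owner,
        "path_readonly": not (info.st_mode & stat.S_IWUSR),
    }


with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "sample.txt")
    with open(sample, "w") as handle:
        handle.write("hello")
    populated = file_properties(sample)
    empty = file_properties(sample, populate=False)

print(populated["path_filesize"])  # 5
```

Skipping the `os.stat` call when `populate` is false is the reason a setting like this can save time on large folders.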