Metadata Filtering in a Worksheet
A filter panel can be presented to the left of the results. Talend Data Catalog provides a sophisticated set of post filtering options.
You may filter to only those results which have specific values in specific properties.
Filter Preferences
You may specify filter Preferences including:
- Apply Filters Automatically – When checked, if you update or define a new filter, the results are updated automatically. Otherwise, the an APPY button appears in the UI and you must click this button to update the results.
- Ignore Connection Objects – not show objects which are part of connection definitions, such as ETL sources and targets, or BI sources. These are not a part of the actual data store models (databases, files, etc.) and are often not important for data analysis.
Ignore Connection Objects only removes from the search results those matching objects which are actually contained inside connection models. Simply because the word “connection” is in the object type name does not mean they are a part of the connection models. E.g., objects of type, connectioncolumn are not removed from the search results.
Apply Filters Automatically or Manually
Depending upon the worksheet Apply Filters Automaticallypreferences settings, worksheets will either:
- Apply Filters Automatically is checked – Every action, such as to update the filter, add a column set preferences, etc., will cause the worksheet to re-run the query
- Apply Filters Automatically is un-checked – You must click the APPLY button (which is only available in this mode) to re-run the query. The button is highlighted when a refresh of the query would produce new results.
Depending upon how your group preferences are configured, you may not be able to adjust the Apply Filters Automatically.
Filter by example by column
You may always filter the results by what is already presented. E.g., if one wishes to only see columns for Type View Column, you may simply right-click a cell in the grid that contains that value and select that option.
Using the Filter Panel
You may also expand the Filters panel on the left in order to limit the results based upon advanced matching criteria.
Common Filter Criteria
Filter Criteria | Description |
Common Criteria | |
Business Description | Business Description of the object |
Business Description Inferred | The business description inferred from semantic flow analysis |
Business Name | Business name of the object |
Business Name Inferred | The business name inferred from semantic flow analysis |
Certified | Filter by certification status |
Certified by | Only those results certified and/or certified by a particular individual or set of individuals |
Collections | Filter by collection membership |
Comment Count | Only those results commented according to a particular count |
Commented By | The list of users who commented on the object |
Model | Only those results in particular Models |
Created By | The user who created an object of authored content including custom model objects (e.g.glossary terms). Note that this attribute does not apply to imported model objects. |
Created Date | The date when an object of authored content was created, including custom model objects (e.g. glossary terms). Note that this attribute does not apply to imported objects (e.g, tables, columns) except the repository model object itself. |
Data Classifications Data Classifications Approved Data Classifications Matched Data Classifications Rejected | Filter by data classes associated with the results. |
Documentation | Documentation of the object |
Endorsed by | Only those results endorsed and/or endorsed by a particular individual or set of individuals |
Endorsement Count | Only those results endorsed according to a particular count |
External URL | A link to an imported object in its original web based UI |
Has Data Impact | Filter by existence of objects in the data impact trace |
Has Data Lineage | Filter by existence of objects in the data lineage (source) trace |
Has Semantic Definition | Filter by existence of objects in the semantic definition trace |
Has Semantic Usage | Filter by existence of objects in the semantic usage trace |
Inferred Documentation | Inferred documentation of the object which has been inferred from semantic flow analysis |
Labels | Only those results tagged by particular labels |
Mapped Documentation | Mapped documentation of the object |
My Workflow Tasks | The list of workflow actions that can be performed by the current user on the object. This attribute only applies to models where workflow has been enabled. See the available workflow actions |
Object Id | The internal object ID |
Object Profile | Object profile used |
Object Stable Id | Public id for the object that is independent of which version(s) contain it |
Object System Type | Filter by the internal system type of object. |
Object Type | The display type of the object |
Parent Object Name | Filter by the name of parent object, e.g., all tables in the “dbo” schema |
Parent Object Type | Filter by the type of parent object, e.g., all objects in a schema |
Sensitivity Label | Filter by the current sensitivity label association |
Sensitivity Label Approved | Filter by the current sensitivity label if approved |
Sensitivity Label Lineage Proposed | Sensitivity label inferred from other objects using data flow analysis |
Sensitivity Label Data Proposed | Sensitivity label detected during the data classification process and thus associated with this object |
Sensitivity Label Rejected | Filter by the current rejected sensitivity labels, including all labels which were rejected and will not be assigned automatically as a part of data classification. |
Search Text | Text to match against. One may select which properties (e.g., name vs. description) to match against. |
Updated by | The user who last updated an object of authored content, including custom model objects (e.g.glossary terms). |
Updated Date | The date when an object of authored content was last updated, including custom model objects (e.g.glossary terms). Or for imported objects, it is the last date the repository object was updated, not the last date the object was updated in the source system. |
Viewed Date | The lasty time the current user visited the detail page of the object |
Warned by | Only those results warned and/or warned by a particular individual or set of individuals |
Warning Count | Only those results warned according to a particular count |
Watchers | Filter by users assigned as watchers for the object |
Workflow Assignees | Identifies which users can perform the next step of the workflow. This attribute is applicable only to models where workflow has been enabled. See the available workflow actions |
Workflow Deprecation Required | Filter by the status Workflow Deprecation Required, indicating that this object is to be deprecated |
Workflow Published | Filter by the status Workflow Deprecation Required, indicating that this object is to be deprecated |
Workflow State | The current state which the object is in (Draft, Under Review, Published, etc.). This attribute only applies to models where workflow has been enabled. |
Additional Criteria Groupings | |
DATA PROFILING | Filter by various data profiling attributes |
OBJECT ATTRIBUTES | Filter by the attributes of the objects filtered by category, including custom attributes |
OBJECT RELATIONSHIPS | Filter by the relationships of the objects. |
OBJECT ROLES | Filter by responsibility assignment, including Steward |
Use +FILTER to add addition filter criteria, criteria, click the X next to a specific filter to clear that one, or double-click on the name of the filter to clear it.
You may always filter the results by what is already presented. E.g., if one wishes to only see columns for Object Type of View Column, you may simply right-click a cell in the grid that contains that value and select that option.
Special Columns Which May Not be Used in Filters
These are documented in the Developer’s Guide as it is a fundamental feature of Metadata Query Language (MQL). Obviously, they will not appear in the list of possible filters as a result of the +FILTER button.
Remove a Filter Criteria from the Worksheet
Click the X that appears in the upper right corner of the criterion in the Filter panel.
Define the Order of the Filter Criteria
To change the order of a particular filter criterion click on the Configure icon that appears in the upper right corner of the criterion in the Filter panel and select Move Up or Move Down.
Filter by Numeric Attributes
Filtering criteria may be based upon the <,<=, >, >= or BETWEEN (Note: NOT is not supported) operators on all numeric data profiling filters and all integer attribute types.
Steps
- Open a worksheet by any of the create methods detailed, or go to
- WORKSHEETS > Manage,
- WORKSHEETS > RECENT >,
- WORKSHEETS > FAVORITES >.
- Click +FILTER and add a numeric column.
- Specify the criteria, <,<=, >, >= or BETWEEN
- Click outside the dialog box.
Example
Create a worksheet (search) for Fields in Files.
Click +FILTER and select Data Profiling Distinct, specify less than and specify 6.
Filter by Search Text
You may enter text to filter by, and the text will be matched to the properties specified, as follows.
Steps
- Open a worksheet by any of the create methods detailed, or go to
- WORKSHEETS > Manage,
- WORKSHEETS > RECENT >,
- WORKSHEETS > FAVORITES >.
- Click the Search Text and enter the search text.
- Optionally, specify Search Options, including
- What attributes to search within for the text
- Include Semantic Search
- Obtain Help on the search
- Click outside the dialog box.
Example
Open a worksheet by clicking the Search box in the upper right corner, entering “Invoice” and pressing ENTER. Click Search Options
Filter by Category
You may specify from a hierarchical pick list which object types to include in the results. Thus, you may include, e.g., both Terms and Database Columns in the search results.
Steps
- Open a worksheet by any of the create methods detailed, or go to
- WORKSHEETS > Manage,
- WORKSHEETS > RECENT >,
- WORKSHEETS > FAVORITES >.
- Click the pull-down under CATEGORY and check the box for one or more categories of object types.
- Click outside the dialog box.
Example
Create a worksheet and click the pull-down under CATEGORY and check the box for Fields in Files.
More Information
The list of categories includes:
- System Objects – E.g., Worksheets, Dashboards, Collections, Groups, Users
- Import Model types – E.g., File, Database, Data Modeling, Tableau, InformaticaPowerCenter, etc.
Dataset is a special category that includes all the different types of data set categories, like Database, File, JSON Database, Data Modeling, etc., so that one may refer to Data Attributes generically, and not have to specify all the types of attributes.
- Custom Models and Objects – e.g., Glossaries, Data Mappings, Business Rules, etc.
Filter by Model
Use this filter to restrict the results to only those which are contained within particular models (e.g., a specific imported model or glossary).
In addition, if there is a directory structure in a model (e.g., a database import containing multiple schemas or a repository import containing directory structure, models, connections, etc.) then one may also use checkboxes to filter the results.
Steps
- Open a worksheet by any of the create methods detailed, or go to
- WORKSHEETS > Manage,
- WORKSHEETS > RECENT >,
- WORKSHEETS > FAVORITES >.
- Click +FILTER and select Model.
- Click in the dropdown box and select one or more models in which the results will be included.
- Click outside the dialog box.
Example
Create a worksheet and filter on the text “Invoice”. Then, click +FILTER and select Model.
The faceted search is intelligent first suggesting the models with the largest number of matching potential results. The number of these results is displayed next to the Model.
Click SHOW MORE.
Now, the list is complete (up to 100 items).
Filter by Updated Date and Updated By
Filter to only those results which have an Updated Date value in a specific date range.
Steps
- Open a worksheet by any of the create methods detailed, or go to
- WORKSHEETS > Manage,
- WORKSHEETS > RECENT >,
- WORKSHEETS > FAVORITES >.
- Click +FILTER and select Updated Date.
- Click on the filter condition, including:
- equals
- not equals
- today
- yesterday
- last hour
- last 7 days
- last 60 days
- last 90 days
- last year
- before
- after
- between
- Specify a value for the condition.
- Click outside the dialog box.
Example
Create a worksheet and filter on the text “Invoice”. Then, click +FILTER and select Updated Date.
There are several date type fields, all work the same way in worksheets.
Click the filter Condition (equals by default).
Select last 7 days.
The results are empty if you have not updated in the last 7 days any of the items that would be hit by the other Filter criteria. In this case the search text “Invoice”.
Dates in Worksheets
See Dates in the Product.
Filter by Object Roles
Filter by Object Role responsibility assignment (to users or groups), including Steward. Results then then include all the matching objects which have the particular responsibility assignment specified in the Filter.
Steps
- Open a worksheet by any of the create methods detailed, or go to
- WORKSHEETS > Manage,
- WORKSHEETS > RECENT >,
- WORKSHEETS > FAVORITES >.
- Click +FILTER and scroll down to the OBJECT ROLES section.
- Select a specific role.
- Click on the filter condition, including:
- equals
- not equals
- equals any
- not equals any
- equals all
- not equals all
- exists
- not exists
- Specify a value for the condition.
- Click in the dropdown box and select one or more users and/or groups that the object role is assigned to.
- Click outside the dialog box.
Example
Create a worksheet and filter on the text “Invoice”. Then, click +FILTER and select Data User.
Select exists as the filter condition.
The results are mostly objects contained within a model that is assigned as Data User to any user or group.
E.g., click the model Staging DW in which the first result (Invoice) is located.
Go to the Responsibilities tab.
This model is indeed assigned the Data User object role to Arif Anar and thus all its contained objects (like Invoice) are included in the results that match the other search filters (namely, Search Text “Invoice”).
Filter by Stewardship
In order to filter by stewardship, you must filter by object roles based upon responsibility assignment.
Stewardship is based upon responsibility assignment of the Stewardobject role. Thus, to see all the objects with stewardship assignment for a user, select Steward object role and pick the user you want.
Filter by Label
Filter to only those results which have one of the selected labels assigned.
Steps
- Open a worksheet by any of the create methods detailed, or go to
- WORKSHEETS > Manage,
- WORKSHEETS > RECENT >,
- WORKSHEETS > FAVORITES >.
- Click +FILTER and select Label.
- Select one or more LABLES or All.
Example
Create a worksheet and filter on the text “Invoice”. Then, click +FILTER and select Label.
There are several types of labels in the system. In particular, you may filter by Sensitivity Label and Conditional Labels. Conditional labels work just like labels in terms of filtering.
The faceted search is intelligent first suggesting the models with the largest number of matching potential results. The number of these results is displayed next to the Model.
Click QuarterlyReview in the faceted search filter and only three items are displayed.
Filter by Curation Status
You may endorse, certify, warn and comment on any object.
You may then filter to only those results which have any of these or any of these from a specific set of individuals.
Steps
- Open a worksheet by any of the create methods detailed, or go to
- WORKSHEETS > Manage,
- WORKSHEETS > RECENT >,
- WORKSHEETS > FAVORITES >.
- Click +FILTER and scroll down to Certified.
You may also filter by Certified by, Warned by, Endorsed by or Commented by in order to Filter on the person who made the curation action.
You may also filter by Warning Count, Endorsement Count or Comment Count in order to Filter on the number of those curations notations made.
- Click on the filter condition, including:
- equals
- not equals
- Specify a value for the condition.
Example
Create a worksheet and filter on the text “Customer”. Then, click +FILTER and select Certified.
Select equals as the filter condition.
Select True for the value of the condition.
Filter by Lineage Status
Object in a configuration may have (or may not have) data flow and/or semantic lineage. You may then filter to only those results which have any of these or do not have lineage of a particular type.
You may use any of the four filter criteria:
Filter Criteria | Description |
Has Data Impact | Filter by existence of objects in the data impact trace |
Has Data Lineage | Filter by existence of objects in the data lineage (source) trace |
Has Semantic Definition | Filter by existence of objects in the semantic definition trace |
Has Semantic Usage | Filter by existence of objects in the semantic usage trace |
You may specify:
- Exists – Filters to all objects for which the filter can apply (classifier and feature level objects)
- Not Exists - Filters to all objects for which the filter does not apply (classifier and feature level objects)
- Equals (True/False) – Filters to all objects which have or do not have the lineage.
Example
Search for “Invoice” to create a worksheet. Click +FILTER and select Has Data Impact.
Select equals and True for the condition. Also, it could be useful to add the Has Data Impact column.
Detection of Unused Data Elements in the Architecture
Asking the question the other way, i.e., what objects do not have data impact and are also in the data warehouse, will show you all the data elements which are not being used.
Click +FILTER and select Has Data Impact.
Select Equals and False for the condition.
Use the Model filter to only search Staging DW.dbo.
Also, it could be useful to add the Has Data Impact column.
Filter by Sensitivity Label
You may filter by the data classifications which are associated (tagged) on an object. These filters include:
- Sensitivity Label – The current sensitivity label associated with the object
- Sensitivity Label Approved – The current approved (or manually assigned) sensitivity label associated with the object
- Sensitivity Label Lineage Proposed – inferred from other object using data flow analysis
- Sensitivity Label Data Proposed – detected during the data classification process
- Sensitivity Label Rejected – All labels which were rejected and will not be assigned automatically as a part of data classification.
You may filter by the data classifications which are associated (tagged) on an object. These filters include:
- Data Classifications – Data classes associated with the object. The attribute shows both approved (or manually assigned) data classes as well as proposed ones.
- Data Classifications Approved – Data classes approved for (or manually assigned to) the object.
- Data Classifications Matched – Data classes detected during the data classification process and thus associated with the object.
- Data Classifications Proposed – Data Classes proposed during the data classification process but are not yet approved or rejected.
- Data Classifications Rejected – Data classes which were rejected for the object.
All of the following data profiling result properties may be used as filters and as columns in a worksheet:
- “Data Profiling”.”Distinct”
- “Data Profiling”.Duplicate
- “Data Profiling”.Valid
- “Data Profiling”.Empty
- “Data Profiling”.Invalid
- “Data Profiling”.Min
- “Data Profiling”.Max
- “Data Profiling”.Mean
- “Data Profiling”.Variance
- “Data Profiling”.Median
- “Data Profiling”.”Lower Quantile”
- “Data Profiling”.”Upper Quantile”
- “Data Profiling”.”Avg Length”
- “Data Profiling”.”Min Length”
- “Data Profiling”.”Max Length”
- “Data Profiling”.”Inferred Data Types”
You may filter by which collections each object in the result is a member of.
You may specify multiple collections that the results may be a member of.
You may specify:
- equals: If any of the collections contain an object it is included in the results (same as equals any)
- not equals: If any of the collections contain an object it is not included in the results
- equals any: If any of the collections contain an object it is included in the results (same as equals)
- not equals any: If any of the collections contain an object it is not included in the results (same as not equals)
- equals all: If any of the collections contain an object it is included in the results
- not equals all: If any of the collections contain an object it is included in the results
- exists: If any of the collections contain an object it is included in the results
- not exists: If none of the collections contain an object it is included in the results
You may filter by the existence or content of comments just as you may do so for curations such as endorsement and warning.
Filter by Object Image
The object image is a single image which may be added to the object page Overview tab optionally.
The filter option is object image.
By default, it is not included in the presentation, so you much first add it to the presentation to see the option.
As this is not an indexed search, it will be slow. Please be sure to filter by category, model and any other criteria that will reduce the scope of the search.
Filter by Relationship
As we can see from the MANAGE > Metamodelpage, there is an association between the two different models for business policies and business rules.
We populated these in the Custom Models section, with worksheet examples there.
Filter by Data Documentation/Cataloging
As an effective tool to:
- Identify documented data elements
- Identify those data elements which still require documentation
You may define a worksheet which shows term defined, locally defined, mapped and inferred documentation (i.e., Name and Business Definition) for an imported object, e.g., the Fields of a data lake.
Please see examples in the Data Documentation section, such as an inferred documentation worksheet.
Filter by System Objects
As you may now include “system” objects, including:
- Worksheets
- Dashboards
- Collections
- Groups
- Users
in your worksheet searches, you may filter a worksheet by these objects. The objects may now be included in the Metadata Query Language (MQL) query.
Example
Open a new worksheet and under CATEGORY > System > Groups.
Metadata Query Language and Query Filters
MQL allows users to define powerful and complex metadata queries with a familiar SQL syntax.
The MQL is available through the REST API and constitutes the foundation of the search / worksheet page. For syntax details, refer to the Developer Guide.
You may use the worksheet Preferences to toggle showing or hiding the metadata query language (MQL) query.
Anything you do in the Filters is reflected in the MQL. You may also edit the Filters from the perspective of the MQL directly by using the Advanced model.
Steps
- Open a worksheet by any of the create methods detailed, or go to
- WORKSHEETS > Manage,
- WORKSHEETS > RECENT >,
- WORKSHEETS > FAVORITES >
- In the Results panel, click Preferences > SHOW CURRENT QUERY in the worksheet Preferences.
- Click +FILTER and +QUERY to add MQL directly as additional filters.
You may add any number of query filters. You may toggle them on and off. They become a saved part of any worksheet you save.
Example
Go to WORKSHEETS > Manage > Certified Terms.
Click SHOW CURRENT QUERY in the Worksheet results preferences.
Click +FILTER and +QUERY.
Click EDIT under the MQL FILTER, and build the query for Comment Count greater than or equal to 1.
Click OK.
Name the filter “Highly Commented”.
We now have a reusable MQL query filter, which may be referenced as “Highly Commented” in other worksheets.
Did this page help you?
If you find any issues with this page or its content – a typo, a missing step, or a technical error – let us know how we can improve!