JedAI Client 6.6.0 User Guide [KB]


Joel Branch

User Guide

The Lucd JedAI Client is downloaded locally on your device and interfaces with the Lucd Platform. The client enables users to visualize, transform, and prepare data for use in modeling frameworks (TensorFlow, PyTorch, Scikit-learn, etc.). Models can be uploaded and trained in the platform. The client supports touchscreen input, though a touchscreen is not required.

System Requirements

The following specifications are required in order to run the client.

  • Windows or MacOS
  • 4 GB Memory
  • Modern CPU

Although not required, we recommend the following specifications in order to maximize the performance of the client.

  • A GPU to support accelerated rendering
  • 1600x900 display resolution minimum

Installation Instructions

The client is distributed via Lucd’s Steam Store.

A user is required to obtain a Steam account in order to access the client download.


Usage Instructions

Login

Log in to the client using the credentials provided to you.

login

  • Username
  • Password
  • Domain
    • Cloud customers will leave the domain field blank when logging in.
    • Private build customers will be provided a domain to use when logging in.
  • Login - Click to submit login credentials and enter the application.
  • New User - If this is your first time using Lucd, click here to register as a new user.

Register a new user

register

  • Generate password - Have Lucd suggest a password that meets the password requirements.
  • Password requirements - Hover to view the Lucd password requirements
    • Cannot reuse ANY old password.
    • 2 instances of all character classes.
      • Uppercase
      • Lowercase
      • Number
      • Special: !@#$%^&*()
    • No more than 2 consecutive characters from the same class (123 is invalid).
    • No repeating characters (33 is invalid).
  • Register - Click to submit your details and return to the login screen.
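
The password rules above can be checked locally before submitting the registration form. The following is a minimal sketch, not Lucd's actual validator: the function names are hypothetical, "no repeating characters" is assumed to mean no identical adjacent characters (based on the "33" example), and the "cannot reuse old passwords" rule is left out because it can only be checked by the server.

```python
import string

# Special characters listed in the requirements above.
SPECIAL = "!@#$%^&*()"


def char_class(ch):
    """Return the class name for a character, or None if it fits no class."""
    if ch in string.ascii_uppercase:
        return "upper"
    if ch in string.ascii_lowercase:
        return "lower"
    if ch in string.digits:
        return "digit"
    if ch in SPECIAL:
        return "special"
    return None


def check_password(password):
    """Return a list of rule violations (an empty list means all checks pass)."""
    problems = []

    # Rule: at least 2 instances of every character class.
    counts = {"upper": 0, "lower": 0, "digit": 0, "special": 0}
    for ch in password:
        cls = char_class(ch)
        if cls is not None:
            counts[cls] += 1
    for cls, n in counts.items():
        if n < 2:
            problems.append(f"needs at least 2 {cls} characters")

    # Rule: no more than 2 consecutive characters from the same class.
    run_class, run_len = None, 0
    for ch in password:
        cls = char_class(ch)
        if cls is not None and cls == run_class:
            run_len += 1
        else:
            run_class, run_len = cls, 1
        if cls is not None and run_len > 2:
            problems.append("more than 2 consecutive characters from one class")
            break

    # Rule (assumed reading): no character repeated back to back.
    if any(a == b for a, b in zip(password, password[1:])):
        problems.append("contains a repeated character")

    return problems


print(check_password("aB1!cD2@"))    # passes every local check
print(check_password("abc1!CD2@X"))  # three lowercase letters in a row
```

Returning a list of problems rather than a single boolean makes it easy to show the user every rule that failed at once.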

After registering a new user, you may immediately log in with that username.


Projects

Immediately after login, the Projects view is displayed. A project is a handy way to group artifacts based on data science problem.

projects

  • Available Projects – a list of all projects the logged in user has access to open.
  • Global Status – the status of all artifacts on the currently logged in system
  • Federated Status – the number of online/offline artifacts of the given type
  • Open Unallocated – begin using Lucd without an open project. Any artifacts created here will be saved as unallocated and still accessible from this button.
  • Search – search for a project by name or description
  • Grid/List view – change how the set of projects is displayed
  • Project – select this item to see more details about the project

project_details

  • Project Details – View the name, description and artifact counts for the selected project
  • Open Project – Open the currently selected project

Hovering over a project item will display the ‘Options’ menu.

projectoptions

  • Edit Details – change the name and description of the hovered project
  • Change Cover – select a meaningful cover photo (optional)
  • Show Details – View project details on right side of screen
  • Delete – Delete the project. Any artifacts allocated to that project will be moved to the ‘Unallocated’ space and can still be used.

Upon login, a user will see some form of this menu bar at the top of the screen, depending on which view is currently open.

navigation

  • Projects – close the currently open project and return to the ‘Projects’ view
  • Data – Click to go to the Workflow space. Hover and select from:
    • Sources – view the sources visualization
    • Query Builder – go to the query building tool
    • Virtual Datasets – view a list of available virtual datasets for training
  • Assets – Click to view a list of available embeddings for training
  • Modeling – Click to go to the ‘Modeling’ view to easily start a training run. Hover and select from:
    • Models – view a list of available models for training
  • Federation – Click to view a list of currently connected federates and artifacts associated with each
  • Project name – the name of the currently open project
  • Federate – Hover to view the currently connected domain
  • Username – the currently logged in user
  • Minimize – Hide any open dockable panels to see the main view. Click again to unhide.
  • View Log – Click to view a list of status messages on the Lucd system
  • Settings – Click to edit various user and system settings

Sources

The Lucd client can show the user all available sources across the federation, as well as a data ingestion over time visualization.

sources

  • Sources table – list of all sources in federation
  • Federate indicator – hover over to see which federates contain the source. If the indicator is missing, then the source only exists on the logged in domain.
  • Refresh – refresh the displayed data
  • Ingestion over time viz – color represents relative number of records ingested during a given time period for a source. Click a box to zoom into that time period across all sources.
  • Back – Go up a time period (ex: Month to Year)

Data

After selecting a project, a user is taken to the Data Transform view, where queries can have EDA operations added to them and then built into a Virtual Dataset (VDS).

workflowsidebar

  • Saved Workflows – Each row represents a query that has been saved and is eligible to have EDA operations performed on it. Can be dragged and reordered on the ‘Active Workflows’ space.
    • Federate – hover over to see the federates of all VDSs contained in the saved workflow. If it is orange, then at least one VDS has an issue on a federate. If this icon is not visible, then all VDSs in that workflow exist only on the logged in domain.
    • VDS – The number of VDS created from the given workflow
    • Workflow name – This will appear red if any operations within the workflow have returned an error. The error will go away once the operation has returned successfully.
    • Quick add – click to add the workflow to the 3D visualize space
  • Begin a new query – Click to build a query inside query builder so that it can be saved to the Transform space
  • Available operations – Click to begin adding an operation to a selected 3D node

Workflow 3D Space

data_workspace

  • Zoom – click and drag to zoom in and out of 3D space
  • Arrow to node – click to move selection to a different node. Can also use arrow keys.
  • Active Workflows – an ordered list of workflows currently displayed in 3D space. Can be dragged into a new order or clicked on to zoom to the root node of the selected workflow.
  • Selected node – Select any node to see additional options
  • Child node – children are displayed to the right of a parent node with lines connecting it.

Query node

querynode

  • Remove – remove the selected query and accompanying workflow from the 3D space
  • Delete – delete the selected query and accompanying workflow. This cannot be undone.
  • Edit – Reloads the query parameters into the query builder so it can be modified and saved as a new query
  • Preview Data – Execute the query and visualize the results
  • Create VDS – Begins the process for creating a Virtual Dataset used for training

Operation nodes

opnodes

  • Operation name
  • Operation type
  • Delete – delete the selected operation and all downstream operations in workflow. This cannot be undone. Will not execute if there is a VDS downstream.
  • Preview Data – Execute the query and all operations up to and including this one, then visualize the results.
  • Create VDS – Begins the process for creating a Virtual Dataset used for training

Virtual Dataset node

vdsnode

  • VDS Name
  • Delete – delete the selected VDS. This cannot be undone.
  • Preview Data – Execute the query and all operations leading to this VDS, then visualize the results.
  • Create Embedding - Only available with text models.
  • Merge VDS - Click and drag to another Virtual Dataset to merge them together.
  • Start training – Open the Modeling view to train with this VDS
  • Federate – hover over to see the federates holding this VDS. If it is orange, then at least one of the federates is returning an error with the VDS.

Once a node is selected in the data transform space, you may use the arrow keys to navigate quickly between adjacent nodes.

DataTransformNavigation

Collapsing nodes with double click

Double clicking a node will collapse all children downstream from that node and add a superscript next to it, indicating how many nodes were collapsed. This can be useful in large, spread out trees.

DataTransformCollapseNodes

Rearranging active workflows

Transform workflows in the active list can be rearranged in any order. This can be useful for comparing trees, or to bring two Virtual Datasets closer together to perform a merge.

DataTransformActiveWorkflowNavigation

Preparing Text Data for Model Training

Lucd provides special operations for easily preparing text data for model training, saving a model developer valuable time in manually coding routines for text transformation.

nlp

  • After creating an EDA tree based on a query of a text data source, a developer can add a new operation to the tree based on NLP operations as shown above.
  • NLP operations (e.g., stopword removal, whitespace removal, lemmatization) can be applied in any sequence.
  • It’s important to select the correct facet as the “text attribute.”
  • One can also elect to apply tokenization based on a document level (i.e., create one sequence of tokens for the entire facet value per record), or sentence level (i.e., create a token sequence per sentence in the facet for a record).
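
To make the difference between document-level and sentence-level tokenization concrete, here is a small dependency-free sketch. It is not Lucd's implementation: the stopword list, the regular expressions, and all function names are illustrative assumptions, and lemmatization is omitted because it needs a language resource.

```python
import re

# Tiny illustrative stopword list; a real list would be much longer.
STOPWORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def clean(text):
    """Whitespace removal: collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def document_level(text):
    """One token sequence for the entire facet value of a record."""
    return remove_stopwords(tokenize(clean(text)))


def sentence_level(text):
    """One token sequence per sentence (naive split after ., ! or ?)."""
    sentences = re.split(r"(?<=[.!?])\s+", clean(text))
    return [remove_stopwords(tokenize(s)) for s in sentences if s]


record = "The model   is trained.  The results are saved to a file!"
print(document_level(record))
print(sentence_level(record))
```

Document level yields a single flat list per record, while sentence level yields a list of lists, which matters for models that expect one sequence per sentence.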

Saving VDS with Processed Text

When a developer wants to create a new virtual dataset including the transformed text data, they must choose the “processed_text” facet as the “sole” feature of the virtual dataset as shown below.

nlp_vds

Currently, Lucd does not support text model training that incorporates multiple feature columns; only the “processed_text” facet may be selected.

Multi-column text model training will be supported in a future release.

Applying Custom Operations

Once custom operations have been defined and uploaded using the Lucd Python Client library, they are available in the GUI for usage in data transformation.

custom

As shown above, clicking on a custom operation will show further details, specifically the features the operation uses as well as the actual source code defining the op. As mentioned in the documentation for defining custom operations via the Lucd Python Client, one must select how to apply the operation based on one of the following three Dask dataframe approaches:

Applying Image Operations

To apply image operations, select the Image Ops tab within the New Op menu in an EDA tree.

image_op

  • It’s important to select an image facet as the “Feature.”
  • The currently provided operations are as follows:
    • Vertical and horizontal flips
    • Grayscale Contrast normalization
    • Normalize (0 mean and unit variance)
    • Resize width & height
    • Color inversion
    • Crop borders
    • Gaussian blur
    • Rotate
    • Min-max scaling
    • To array (converts binary data to NumPy array)
    • Reshape dimensions
  • Operations can be applied to percentages of a dataset instead of the entirety, and can also be used to augment existing data instead of operating in-place.
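
A few of the listed operations, and the difference between in-place application and augmentation, can be illustrated as follows. This is a sketch under stated assumptions rather than the platform's code: images are plain nested lists of grayscale pixel values to keep it dependency-free, and `apply_op` with its `fraction`, `augment`, and `seed` parameters is a hypothetical helper.

```python
import random
import statistics


def horizontal_flip(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def vertical_flip(img):
    """Reverse the order of the rows."""
    return img[::-1]


def invert(img, max_value=255):
    """Color inversion for a single-channel image."""
    return [[max_value - p for p in row] for row in img]


def min_max_scale(img):
    """Rescale pixel values to the range [0, 1]."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on a constant image
    return [[(p - lo) / span for p in row] for row in img]


def normalize(img):
    """Shift and scale pixels to zero mean and unit variance."""
    flat = [p for row in img for p in row]
    mean = statistics.fmean(flat)
    std = statistics.pstdev(flat) or 1.0
    return [[(p - mean) / std for p in row] for row in img]


def apply_op(images, op, fraction=1.0, augment=False, seed=0):
    """Apply op to a random fraction of images, in place or as added copies."""
    rng = random.Random(seed)
    k = round(len(images) * fraction)
    chosen = set(rng.sample(range(len(images)), k))
    if augment:
        # Keep every original and append the transformed copies.
        return list(images) + [op(images[i]) for i in sorted(chosen)]
    # Replace only the chosen images and leave the rest untouched.
    return [op(im) if i in chosen else im for i, im in enumerate(images)]


img = [[0, 128], [255, 64]]
dataset = [img, img, img, img]
print(len(apply_op(dataset, invert, fraction=0.5)))                # same size
print(len(apply_op(dataset, invert, fraction=0.5, augment=True)))  # grows
```

With `augment=False` the dataset keeps its size and some images are replaced; with `augment=True` the originals are kept and the transformed copies are appended.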


Query Builder

The Lucd client offers a unique and intuitive way to query data, giving a user flexibility in how complex queries are strung together to retrieve exact results.

querysidebar

  • Sources – a list of available sources to query. This can be dragged into the node editor window.
    • Quick add – click to add this source to the node editor window
    • Federate status – Hover to see which federates hold the source. If this icon does not show, then the source only exists on the currently logged in domain.
  • Data Models – a list of available data models to query. This can be dragged into the node editor window.
    • Quick add – click to add this data model to the node editor window
    • View stats – click to view statistics of this particular data model
    • View features – click to view the features of this particular data model
  • Features – a list of features in this data model. This can be dragged into the node editor window
    • Quick add – click to add this feature to the node editor window
  • Federates – a list of available federates for filtering the query.
    • Note: the currently logged in domain will ALWAYS return results regardless of whether it is selected.

Node Editor Window

node_editor_window

  • Global search parameters – Click to view simple/advanced search filters
  • Zoom – drag this slider or use the mouse wheel to zoom in and out of the node view
  • Lucene syntax – a text representation of the search to be executed.
  • Copy Lucene syntax – click to copy the Lucene syntax. This can be pasted into the global search parameters to customize a search with features not supported by the node editor.
  • Search – Click to execute the search
  • Save – Save the search for use in Transform workflow.
    • Note that a search must be executed before it is saved.
  • Group – Toggle, then click and drag around a set of nodes to add a grouping around them. This acts as a set of parentheses in the Lucene syntax. This function can also be accomplished by holding Shift + Left click + drag
  • Refresh – Click to retrieve and repopulate the list of sources/data models/federates.
  • Exit – Close the query builder. Any unsaved progress will be lost.
  • Modify Node – Change node filter settings
  • Delete Node
  • Node connection dropdown – Click to select from AND/OR/XOR
  • Node connector – click and drag to connect to another node or grouping
  • Statistics – click to view statistics of last executed query

Advanced Search Parameters

advanced_search

  • All these words – search results must include all these words
  • Lucene query – add a Lucene query that will take the place of whatever is in the Node Editor Window
  • This exact phrase – search results must include this exact phrase
  • None of these words – search results must not have any of these words
  • Records per source/model - return this many records per source/model
  • Total records to return - return at least this many total records
  • Date range – search results must be from within this time period
  • Randomize – results should be returned in a random order
  • All Sources/Models - results should include a sample from every applicable source and data model
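
For readers who prefer to type a Lucene query directly, the word and phrase filters above correspond naturally to standard Lucene query syntax, where `+` marks a required clause, `-` a prohibited clause, and double quotes a phrase. The sketch below shows that correspondence. How the client itself translates these fields is not documented here, so the mapping and the function name `build_text_query` are assumptions.

```python
def build_text_query(all_words=(), exact_phrase="", none_words=()):
    """Build a Lucene-style query string from simple search filters."""
    # Every word in all_words becomes a required clause.
    clauses = [f"+{word}" for word in all_words]
    # The exact phrase becomes a required quoted phrase.
    if exact_phrase:
        clauses.append(f'+"{exact_phrase}"')
    # Every word in none_words becomes a prohibited clause.
    clauses += [f"-{word}" for word in none_words]
    return " ".join(clauses)


print(build_text_query(["pump", "failure"], "pressure drop", ["test"]))
```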

Search Results

search_results

  • Visualization panel – this will update with each search executed
  • Federate distribution – a bar chart showing how many records were returned from each applicable federate
  • Query statistics – each returned feature will show relevant statistics, and if applicable, a box plot to visualize.

Adding a node and changing connection logic

Nodes can be dragged into the workspace, or quickly added using the ‘+’ button on the left. The dropdown connecting two nodes or groups can be changed to AND/OR/XOR.

QueryAddNodeChangeLogic

Grouping nodes

Nodes can be grouped together using the ‘Group’ toggle at the top or by holding shift and dragging. Groupings will add parentheses around the selected node in the Lucene output.

QueryGrouping
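
The relationship between nodes, connection dropdowns, and groupings can be pictured as a small expression tree that renders to a query string, with each group contributing one pair of parentheses. The following is an illustrative sketch only: the `Term` and `Group` classes, the field names, and the exact output format are assumptions, and only AND and OR are used because the way the client renders XOR is not described here.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Term:
    """A single node: one field filter."""
    field: str
    value: str

    def render(self) -> str:
        return f"{self.field}:{self.value}"


@dataclass
class Group:
    """A grouping of nodes; items[i] and items[i + 1] are joined by operators[i]."""
    items: List[Union["Term", "Group"]]
    operators: List[str]

    def render(self) -> str:
        parts = [self.items[0].render()]
        for op, item in zip(self.operators, self.items[1:]):
            parts += [op, item.render()]
        # A grouping acts as a set of parentheses in the output.
        return "(" + " ".join(parts) + ")"


def render_query(items, operators) -> str:
    """Render the top level of the editor without outer parentheses."""
    return Group(items, operators).render()[1:-1]


query = render_query(
    [
        Term("source", "sensors"),
        Group([Term("status", "open"), Term("status", "pending")], ["OR"]),
    ],
    ["AND"],
)
print(query)  # source:sensors AND (status:open OR status:pending)
```

Without the grouping, the same three terms would be evaluated according to the default operator precedence, which is why grouping is the reliable way to express the intended logic.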

Manually connecting nodes

Nodes can be manually connected and disconnected by clicking and dragging either of the two circles on the side of a node/group.

QueryNodeConnection


Visualization

General

viz_general

  • Query name – the name that was saved with the query
  • Record count – number of records returned out of the total number of records across the system that match the query
  • Visualization selector – click each to change the visualization
  • Quick Add – click to add another visualization window of the same data slice
  • Maximize - click to expand the panel to full screen

Table

viz_table

  • Feature/column names
  • Histogram – lightweight visualization of numeric field distribution
  • Top/unique value – for string types only
  • Table row – click to see list of feature values
  • Paging controls – Go forward or backward in results

Scatterplot - 2D

viz_sc2

  • Axis selector – Select the axes from a list of available features
  • Filter knobs – Drag these knobs to adjust axis filter. Drag away from plot to reset the axis
  • Remove plot – Removes the plot from view
  • Add new plot – Adds a new plot to view

Box Plot

viz_boxplot

  • Feature selector – Select the feature from a list of available features
  • Remove plot – Removes the plot from view
  • Add new plot – Adds a new plot to view

Histogram

viz_histo

  • Feature selector – Select the feature from a list of available features
  • Filter knobs – Drag these knobs to adjust axis filter. Drag away from plot to reset the axis
  • Remove plot – Removes the plot from view
  • Add new plot – Adds a new plot to view

Scatterplot - 3D

viz_sc3

  • Axis selector – Select the axes from a list of available features
  • Filter knobs – Drag these knobs to adjust axis filter. Drag away from plot to reset the axis
  • Scatterplot point – Select to view details. Double click to focus in on that point
  • Reset view – Click to move camera back to starting view
  • Drag – Orbit around focal point
  • Mouse wheel – Zoom in/out
  • Shift + Drag – Pan camera
  • Ctrl + Drag – Look around

Parallel Coordinate Plot

viz_pcp

  • Feature selector – Select the feature from a list of available features
  • Add Feature – Click to add an additional feature to visualization
  • Remove Feature – Click to remove a feature from visualization
  • Reorder Features – Click and drag to reorder feature list
  • Maximum
  • Minimum
  • Feature name
  • Reset view – Click to move camera back to starting view
  • Drag – Orbit around focal point
  • Mouse wheel – Zoom in/out
  • Shift + Drag – Pan camera
  • Ctrl + Drag – Look around

Correlation Matrix

To see how each field relates to all the other fields, use a Correlation Matrix. Only numerical fields are displayed. Each bar is scaled on its y axis according to how its two contributing fields relate on a scale of –1 (red) to 1 (blue).

viz_corr

  • Feature name
  • Matrix bar - Click to see details about this specific feature pair.
  • Reset view – Click to move camera back to starting view
  • Drag – Orbit around focal point
  • Mouse wheel – Zoom in/out
  • Shift + Drag – Pan camera
  • Ctrl + Drag – Look around
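
The values behind each matrix bar can be reproduced outside the client. The sketch below assumes the Pearson correlation coefficient, which ranges from –1 to 1 like the scale described above. The guide does not state which coefficient the client uses, and the function names and sample columns are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map every pair of column names to its correlation."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


columns = {"x": [1, 2, 3, 4], "y": [2, 4, 6, 8], "z": [4, 3, 2, 1]}
matrix = correlation_matrix(columns)
print(matrix["x"]["y"], matrix["x"]["z"])
```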

Statistics

viz_statistics

Modeling

mod_left_sidebar

  • Models - Click and drag to training template model slot to begin training.
    • Model library (PyTorch, TensorFlow, XGBoost, SKLearn, Federated Learning)
    • Model framework (Simple, Advanced, Federated)
  • Virtual Datasets - Click and drag to training template VDS slot to begin training.
    • Federate status
  • Assets - Click and drag to training template asset slot when a text model has already been added to begin training
  • Show/Hide artifacts
  • Refresh data
  • Upload Model

Training Template/Parameters

mod_template

  • VDS Slot - Drag a VDS from the left sidebar to one of these slots to set it for that phase of training.
  • All VDS Slot - Drag a VDS here to set it to all three phases of training.
  • Model Slot - Drag a model here to set it for training.
  • Asset Slot - Drag an asset here when a text model has been selected to set it for training.
  • Training Name - Give the training a name to find it easier at a later time.
  • Default - Reset the value to the saved default.
  • Save Defaults - Save all current values as the new default values.
  • Reset all to defaults - Reset all changed values back to their saved defaults.
  • Clear saved defaults - Reset all saved defaults back to factory settings.
  • Training Parameters - Expand/Collapse parameters

Dragging components to training template

Models and Virtual Datasets can be dragged to the training template. Items in VDS slots can be rearranged.

Modeling_DragModelVDS

mod_trainings

  • Trainings - Click to see additional details.
    • Model library (PyTorch, TensorFlow, XGBoost, SKLearn, Federated Learning)
    • Status - If there is an error, click to see additional details.
  • Training Details
    • Start Train - Click to reload the training parameters to begin a restart
    • Delete training
    • Download - Click to download training artifacts as a .zip file
    • View Profile - Click to see the training profile
  • Show/Hide trainings

Modeling Graph

mod_graph

  • Model node
  • VDS node
  • Asset node
  • Training connector - Click to use these artifacts in a new training
  • Number of trainings - The number of trainings using this combination of artifacts

Training Profile

Performance Graph

prof_performance

  • Available Plots Selector - Choose from a list of selected graphs
  • Plot explanation - Get a description of the selected graph type
  • Update interval - How often the graph should update in seconds. Default 100.
    • Number of points displayed is limited to 1000 to keep updates consistent.
  • Line Toggle - Disable this value
  • Line Intersect - Click to freeze in place. Click again to unfreeze.

Confusion

prof_confusion

  • Interactable Square - Click a square to see details about actual and predicted values. Values are only displayed in a square if greater than 0.
  • Show Records - Toggle box values between percentages and record counts
  • Histogram - Displays all predicted values for an actual value. Clicking a bar will update the table beneath it.
  • Table - A tabular view of sample results from the selected prediction.
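
The two display modes of the confusion view (record counts and percentages) can be sketched as below. This is an illustration, not the client's code: the function names are hypothetical, and percentages are assumed to be relative to each actual class, which the guide does not specify.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count records for each (actual, predicted) pair that occurs."""
    return Counter(zip(actual, predicted))


def as_percentages(counts):
    """Convert counts to percentages of each actual class (assumed convention)."""
    row_totals = Counter()
    for (actual_label, _), n in counts.items():
        row_totals[actual_label] += n
    return {pair: 100.0 * n / row_totals[pair[0]] for pair, n in counts.items()}


actual = ["cat", "cat", "dog", "dog", "dog"]
predicted = ["cat", "dog", "dog", "dog", "cat"]

counts = confusion_counts(actual, predicted)
percentages = as_percentages(counts)
print(counts[("dog", "dog")], percentages[("cat", "cat")])
```

Because the counter only stores pairs that actually occur, empty squares simply have no entry, which matches the behavior of showing values only when they are greater than 0.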

Explainability Analysis

Lucd provides the ability to visualize “explanations” of a model’s output given specific inputs. Generally, explanations take the form of computed attribute weights, indicating how much each attribute contributed to a model’s decision. This supports the ability to either debug a model or scrutinize the data fed to the model. This particular feature is supported by integration of the LIME framework. The figures below illustrate the explainability panel on the model profile view for various model types.

Explainability - Tabular/Regression

For analyzing a tabular model, the user enters one or more samples into the input text box as a list of lists of numbers, where each inner “list” is a single sample, and then clicks the “Explain” button underneath the box. The time required to run explanation analysis depends on the complexity of the model. Models with type tabular_classification or regression can explain tabular data predictions.

exp_tabular

  • Input Array
    • Enter values to predict on. Must be valid JSON, as shown above
  • % of Training Data
    • Percentage of training data to build the explainer. Must be greater than 0 and less than or equal to 1
  • Number of Top Explanations
    • Positive integer denoting how many class explanations to show
  • Inputs
    • Colored to show how each influences top class prediction
  • Class Probabilities
    • Class predictions and corresponding likelihood
  • Explanation
    • How each input influences a positive or negative prediction
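
The constraints on the three inputs above can be checked before clicking “Explain”. The following sketch is an assumption-laden illustration, not part of the client: `parse_explain_request` and its parameter names are hypothetical, and it only encodes the rules stated in the list (valid JSON list of lists of numbers, a training fraction greater than 0 and at most 1, and a positive integer number of explanations).

```python
import json


def parse_explain_request(input_text, training_fraction, top_explanations):
    """Validate explainability inputs and return the parsed samples."""
    # json.loads raises a ValueError subclass on invalid JSON.
    samples = json.loads(input_text)
    if not isinstance(samples, list) or not samples:
        raise ValueError("input must be a non-empty list of samples")
    for sample in samples:
        is_numeric_list = isinstance(sample, list) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in sample
        )
        if not is_numeric_list:
            raise ValueError("each sample must be a list of numbers")
    if not 0 < training_fraction <= 1:
        raise ValueError("training fraction must be greater than 0 and at most 1")
    if not isinstance(top_explanations, int) or top_explanations < 1:
        raise ValueError("number of top explanations must be a positive integer")
    return samples


samples = parse_explain_request("[[5.1, 3.5, 1.4, 0.2]]", 0.5, 3)
print(samples)
```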

Explainability - Images

Models with type image_classification can explain image predictions.

exp_image

  • Sample Image
    • Select local image to explain
  • Positive Only
    • If True, include only regions of the image contributing to the predicted label.
  • Hide Rest
    • If True, make the non-explanation part of the return image gray
  • Explanation
    • Returned colorized image with shaded regions of positive and negative influence. Red sections detract from the predicted class while green contributes positively to the predicted class.
  • Predicted Probabilities
    • Class predictions and corresponding likelihood

Explainability - Text

For text models, simply type the raw string you would like to have explained by your model. Models with type text_classification can explain text predictions.

exp_text

  • Input Text
    • Text the user would like to predict and explain
  • Output Text
    • Output text with class probabilities highlighted in positive (green) or negative (blue) colors
  • Predicted Probabilities
    • Class probabilities predicted
  • Explanation
    • Words that contribute to positive or negative correlation

Details

prof_details


Federated Lucd

federation

Release 6.5.0 introduces Federated Machine Learning to the Lucd platform. This capability introduces new features to the Lucd platform in order to support the development of federated models. Namely, if your Lucd platform is set up as part of a federation, many of the operations you perform within the JedAI client will automatically be federated. This includes:

  • Query: if your query matches data on multiple systems, you will get results from all of those systems.
  • EDA / search tree creation: saving your query into an EDA tree will also create the EDA tree on your other federates.
  • VDS: a VDS created containing data from a federated query will in turn be created on all federates containing relevant data.
  • Model definition: model definitions uploaded to your JedAI GUI will also be created on other federates.
  • Training object: when training a federated model, a corresponding training object will be created on all participating federates.

Virtual Datasets

vds

  • Open transform - opens the transform workflow that created this VDS
  • Copy ID - useful for finding VDS via REST API calls
  • Create an embedding
  • Delete - Delete a VDS
  • Refresh - retrieves the latest VDS data

Assets

assets

  • Delete an embedding
  • Visualize - See embedding data on a PCA/TSNE chart
  • Refresh - retrieves the latest Asset data

PCA/TSNE

Embeddings can be viewed using PCA/TSNE techniques for visualization.

pca

  • Style - When viewing an embedding’s PCA/TSNE, click to see terms instead of points.
  • Region Select - Toggle to select a cluster of points using a bounding box.
  • Multiple Select - Use to add multiple bounding boxes.
  • Search - Search for a term. All matching terms will be highlighted, as well as shown in a list to the right until there is only one matching term.
  • Filter - Narrow the number of occurrences for a term to a range using.
  • Technique Select - Toggle between PCA and TSNE.
