SAS Demo | SAS Visual Text Analytics Demo (Version 8.4)


MARY BETH MOORE:
Unstructured text is the largest
human-generated data source and it grows exponentially
by the minute. SAS Visual Text Analytics
augments human efforts to analyze unstructured
text data with artificial intelligence. This is achieved through a
variety of modeling approaches that combine the power
of natural language processing, machine learning,
and linguistic rules to help people find
the information they need when they need it. To start preparing the data, SAS
provides predefined concepts that extract and
classify elements of text into groups such as
names of persons, locations, expressions of time,
percentages and more. You don’t have to manually
capture the variety of ways people may express
time or values. The work has already
been done for you. SAS also allows you to
build custom concepts using LITI syntax. LITI is short for
language interpretation for textual information and is
a proprietary programming language that is powerful,
flexible and scalable. Before analyzing
large volumes of text, it’s important to break the
data into chunks and provide the human framework the machine
needs to analyze at scale. Parsing separates text
into its words, phrases, punctuation marks and
other elements of meaning. The majority of these
actions are rule-based, leveraging linguistic expertise
for the specific language. SAS provides parsing actions
as out-of-the-box functionality for 33 languages. In the text parsing node,
I can see all of the terms that the machine
recommends I keep or drop. I have complete
control over what terms I want to keep for analysis. SAS automatically
performs parsing and part-of-speech detection
and resolves all word forms, misspellings, and synonyms
under their parent terms. A term map provides a
visual representation of the most commonly
found words and phrases associated with
the selected terms across the entire collection. You can also view a list
of similar terms, which are other words commonly
found near one another within the document collection. This is a great way to get
ideas for additional keywords to build into your category
or concept definitions. Natural language processing and
unsupervised machine learning can help reveal trends in data
by automatically extracting terms and topics that appear
in correlation to each other throughout a set of documents. This allows you to quickly
discover trends in your data without having to
know explicitly what to look for ahead of time. Topics are automatically
generated by the machine, but SAS Visual Text Analytics
gives you the ability to split or merge topics as you see
fit, allowing you to refine and build upon the
machine’s output. Sentiment analysis is
intuitive across topics with a visual depiction of
positive, negative or neutral sentiment. You also have the
ability to promote topics to categories, which
will automatically convert these statistically
discovered topics into Boolean rules that
you can alter if you wish to fine-tune the results further. Human-designed systems
of categorization can also organize data to
uncover trends and patterns across documents. This allows you to produce
a binary outcome – either the text matches the
rule or it doesn’t. SAS automatically
generates the rules for topics that are
promoted to categories. If a categorical target variable
is pre-identified in the data, SAS can automatically
build rules for each possible level
using the documents as training data for
supervised machine learning. You can also create custom
categories or a taxonomy from scratch. And finally, you can leverage
automatic rule generation to quickly produce syntax,
reducing the time you spend writing rules, which,
in turn, allows you to focus on refining those
rules with your subject matter expertise. SAS Visual Text Analytics
easily scales the process of reading, organizing and
extracting useful information from large volumes
of textual data. It runs on the SAS
Platform and offers multi-threaded
parallel processing for in-memory analytics on a
cloud-ready, open architecture. REST APIs allow for
flexible integration and users have the choice to
code in SAS, Python, R, Java, Scala or Lua. You can deploy models in batch,
in Hadoop, in stream and via APIs. Score code is natively threaded
for distributed processing, taking maximum advantage
of computing resources to reduce latency to results. Look at your
organization and consider the possible revelations your
unstructured data may hold. The power of gaining insights
from that data is incredible and the best way to find them
is to combine natural language processing, machine learning
and human expertise. If you are interested in
learning more about SAS® Visual Text Analytics, please
visit www.sas.com/vta.
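
The transcript describes category rules as Boolean rules with a binary outcome: either the text matches the rule or it doesn't. The sketch below illustrates that idea in plain Python. It is not LITI or SAS category syntax, and it does not call any SAS API; the rule format, the function names (`tokenize`, `matches`, `in_category`) and the sample `billing_rule` are all hypothetical, chosen only for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def matches(rule, tokens):
    """Evaluate a nested Boolean rule against a set of tokens.

    A rule is either a plain term (str) or a tuple whose first item is
    "AND", "OR" or "NOT", followed by one or more sub-rules.
    (Hypothetical format, not SAS syntax.)
    """
    if isinstance(rule, str):
        return rule in tokens
    op, *args = rule
    if op == "AND":
        return all(matches(arg, tokens) for arg in args)
    if op == "OR":
        return any(matches(arg, tokens) for arg in args)
    if op == "NOT":
        return not matches(args[0], tokens)
    raise ValueError(f"unknown operator: {op}")


# Hypothetical category: billing issues that are not about shipping.
billing_rule = ("AND", ("OR", "invoice", "refund", "charge"), ("NOT", "shipping"))


def in_category(text, rule=billing_rule):
    """Return True if the text matches the rule, False otherwise."""
    return matches(rule, tokenize(text))


print(in_category("I was charged twice and want a refund"))
print(in_category("Please refund my shipping fee"))
```

Because this sketch matches exact tokens, "charged" does not match the term "charge"; the first document matches only through "refund". Resolving word forms under a parent term, as the transcript describes for the parsing step, is the kind of normalization a real system would apply before rules are evaluated.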
