Getting Started with OEID 3.1, Part 2 - Oracle Endeca Information Discovery Overview

Welcome back to Getting Started with Oracle Endeca Information Discovery v3.1. In this screencast, you’ll learn about the different types of data you can store in the system for analysis, and get an overview of the three major parts of Oracle Endeca Information Discovery.

Let’s start with where data is stored in Oracle Endeca Information Discovery: Endeca Server. Endeca Server is a hybrid analytic search database that organizes complex and varied data from multiple source systems. Business users query Endeca Server through discovery applications, and Endeca Server uses advanced data structures and algorithms to respond to those requests in real time.

Companies often need business users to be able to upload data into Endeca Server themselves, without any help from IT, and build their own discovery applications to explore that data. Since users don’t always know what questions to ask of the data until they actually start interacting with it, data exploration must be intuitive, allowing users to quickly find answers to new questions as they explore. Users also need to be able to build these discovery applications quickly and easily, and then share them with others. All of these capabilities are part of Endeca Information Discovery Studio. You’ll get a good introduction to Studio in Parts 3 and 4 of this screencast series.

But organizations need to be able to load and explore large amounts of data too, like enterprise data, and that data could include structured, semi-structured, and unstructured data. Let me show you what I mean. An example of structured data is sales transactions. An example of semi-structured data is my product catalog; each product is described in its own unique way. And an example of unstructured data is product reviews.

Let’s look at building out two records with all this data. For the structured data, it’s actually pretty straightforward.
You’ll have attribute-value pairs that look fairly similar across records: transaction ID, product ID, category, etc. It becomes more challenging for semi-structured data. For each product, the records may be laid out differently. And product reviews could also be completely different from record to record. In this example, we’re looking at a simple one-sentence customer review, but some reviews could be half a page, and others could be two pages of information. We want to bring this data in and make it available just like the other data.

We also want to enrich that data by pulling out certain pieces of information from those reviews; for example, whether the review was positive or negative. We can also pull out terms that provide insight, such as “great”, “off road”, and “heavy”. With this example, these two records become very rich and self-describing, so we know everything about this product in each record. And each record, from one to the next, can start to look very different.

So we need diverse data integration: the ability to bring in structured data, semi-structured data, and completely unstructured data. We also want to be able to use data from a variety of sources: data warehouses, departmental data, data from enterprise applications, external data, and social media data, so we see the full picture of what people are saying about our products and services. We need a way to load this into Endeca Server, and then create discovery applications to allow users to explore the data.

To load this type of data, we need an industrial-strength agile ETL tool with extensions for file and web crawling and text enrichment. Together, this is known as EID Integrator, which is used to quickly integrate and enrich the various types of data.
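
To make the two enriched records from the review example a little more concrete, here is a minimal sketch in plain Python. It is not the Endeca Server data model or the Integrator API; the attribute names, the sample values, the word lists, and the `enrich` helper are all made up for illustration. It only shows the idea that records share some attributes, differ in others, and pick up new attributes (sentiment and key terms) from the review text.

```python
# Illustrative only: plain Python dicts, NOT the Endeca Server data model or API.
# Every attribute name, value, and word list here is a hypothetical example.

POSITIVE_WORDS = {"great", "love", "excellent"}   # assumed positive vocabulary
NEGATIVE_WORDS = {"heavy", "poor", "broke"}       # assumed negative vocabulary
TERMS_OF_INTEREST = ["great", "off road", "heavy"]  # terms mentioned in the narration


def enrich(record):
    """Return a copy of the record with sentiment and key terms derived from its review."""
    review = record.get("review", "").lower()
    # Crude tokenization: strip basic punctuation, split on whitespace.
    words = set(review.replace(",", " ").replace(".", " ").split())
    positive_hits = len(words & POSITIVE_WORDS)
    negative_hits = len(words & NEGATIVE_WORDS)

    enriched = dict(record)
    if review:
        if positive_hits > negative_hits:
            enriched["sentiment"] = "positive"
        elif negative_hits > positive_hits:
            enriched["sentiment"] = "negative"
        else:
            enriched["sentiment"] = "neutral"
        # Simple substring match for the terms we care about.
        enriched["terms"] = [t for t in TERMS_OF_INTEREST if t in review]
    return enriched


records = [
    {
        # Structured attributes: the same keys appear on every record.
        "transaction_id": 1001, "product_id": "B-200", "category": "Bikes",
        # Semi-structured attributes: specific to this kind of product.
        "frame_material": "aluminum", "wheel_size_in": 29,
        # Unstructured text.
        "review": "Great bike for off road trails.",
    },
    {
        "transaction_id": 1002, "product_id": "H-310", "category": "Helmets",
        "vent_count": 18,
        "review": "Too heavy for long rides.",
    },
]

enriched = [enrich(r) for r in records]
for r in enriched:
    print(r)
```

Each record carries its own attribute names, so the two records do not need to share a fixed set of columns. A real enrichment step would use proper text analytics rather than word lists; the keyword matching here is only a stand-in.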
So there you have it: the three major components of Oracle Endeca Information Discovery are Endeca Server, Studio, and Integrator. They work together to ingest both data uploaded by a single business user and large, complex data from various sources. The end result is discovery applications that allow intuitive exploration of the data.

In this screencast, you received an overview of Oracle Endeca Information Discovery. In the next screencast, you will see Oracle Endeca Information Discovery in action. Using sample data, I’ll demonstrate how to quickly explore data using advanced search tools, faceted navigation, and interactive visualization components.