Creating Datasets for HCPs

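This example shows how to create a dataset of HCP entities with the Analytics Framework, selecting both a simple attribute and a reference attribute.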

import com.reltio.analytics.framework._
import com.reltio.analytics.data.application._
import org.apache.spark.sql._

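// Log in to the Analytics Framework.
// sqlContext, AF_API_URL, tenant, and token must already be defined in your session.
val framework = AnalyticsFramework.login(sqlContext, AF_API_URL, tenant, token)

// Build a DataFrame of HCP entities
val df: DataFrame = framework.dataAccess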
    .dataset(new EntityDatasetBuilder()
        .ofType("configuration/entityTypes/HCP")
        .select("Id")
        .select("attributes.FirstName") // simple attribute
        .select("attributes.Employment") // reference attribute
        .asTable("hcps")
    )
    .build()

The resulting DataFrame has the following schema:

root
|-- Id: long (nullable = true)
|-- attributes: struct (nullable = true)
|    |-- FirstName: array (nullable = true)
|    |    |-- element: string (containsNull = true)
|    |-- Employment: array (nullable = true)
|    |    |-- element: struct (containsNull = true)
|    |    |    |-- Name: array (nullable = true)
|    |    |    |    |-- element: string (containsNull = true)
|    |    |    |-- Title: array (nullable = true)
|    |    |    |    |-- element: string (containsNull = true)
|    |    |    |-- Summary: array (nullable = true)
|    |    |    |    |-- element: string (containsNull = true)
|    |    |    |-- IsCurrent: array (nullable = true)
|    |    |    |    |-- element: boolean (containsNull = true)
|    |    |    |-- RefEntityID: array (nullable = true)
|    |    |    |    |-- element: long (containsNull = true)
|    |    |    |-- RefRelationID: array (nullable = true)
|    |    |    |    |-- element: long (containsNull = true)

The schema for the Employment reference attribute is similar to that of a nested attribute, except that two extra fields, RefEntityID and RefRelationID, are added automatically. You can use these values to join entities with their referenced objects manually inside the Analytics Framework, or to work with the underlying Reltio graph structure (for example, in GraphX).
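As an illustration of such a manual join, the following is a minimal sketch that uses only standard Spark SQL functions. It assumes that RefEntityID holds the Id of the referenced entity, and that a second DataFrame of those referenced entities (called orgs here, a hypothetical name) is available with an Id column. The object name EmploymentJoin and the output column names hcpId, title, and orgId are likewise illustrative and are not part of the Analytics Framework API.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, explode}

// Hypothetical helper, not part of the Analytics Framework.
object EmploymentJoin {

  // Produce one row per Employment entry of each HCP.
  // Every attribute is an array, so the first element is taken as the value
  // (assumption: each array carries a single value of interest).
  def flattenEmployment(hcps: DataFrame): DataFrame =
    hcps
      .select(
        col("Id").as("hcpId"),
        explode(col("attributes.Employment")).as("employment")
      )
      .select(
        col("hcpId"),
        col("employment.Title").getItem(0).as("title"),
        col("employment.RefEntityID").getItem(0).as("orgId")
      )

  // Join each Employment row to the referenced entity by RefEntityID.
  // A left outer join keeps Employment rows whose referenced entity
  // is missing from orgs.
  def joinWithOrganizations(hcps: DataFrame, orgs: DataFrame): DataFrame = {
    val employment = flattenEmployment(hcps)
    employment.join(orgs, employment("orgId") === orgs("Id"), "left_outer")
  }
}
```

With the df built above, a call might look like EmploymentJoin.joinWithOrganizations(df, orgs). HCPs with no Employment entries are dropped by explode; if they need to be kept, an outer variant of explode would be required where the Spark version provides one.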