5/31/2022 HDS Task Force Meeting
Date
Antitrust Policy
Attendees
- Sean Bohan (openIDL)
- Peter Antley (AAIS)
- Jeff Braswell (openIDL)
- Nathan Southern (openIDL)
- Greg Williams (AAIS)
- Ken Sayers (AAIS)
- Megan Ebling (AAIS)
- Rajesh Sanjeevi
- Satish Kasala (Hartford)
- Truman Esmond (AAIS)
- James Madison (Hartford)
- Allen Thompson (Hanover)
- Tsvetan G (Senofi)
- Joan Zerkovich (AAIS)
- David Reale (Travelers)
- Dale Harris (Travelers)
Agenda:
- Discuss HDS needs, requirements (see notes below)
Action items
Notes:
- What happened last week
- First Meeting under the name "HDS Task Force"
- first formal meeting
- prior sessions had been watercooler discussions around data modeling
- going forward - what is the HDS
- step back without redoing work the group has already done - how do we move HDS forward
- James
- work on requirements (ask to send)
- Useful to look at a couple models, instead of one
- risky to try to do all in one layer
- balance
- useful to have notions of the b-model at highest level, persistence model, loading model
- business, presentation, persistence, loading
- b-model
- largely like list of elements, meaningful chunks
- presentation - optimized for writing queries
- some business but not concerned about system performance
- Persistence - preserving history and truth
- ultimate definition of truth, not necessarily easy to use
- Loading - get the data into the system
- compromise
- loaded data into persistence, presentation asks "what is easier to consume rather than truth"
- trying to build a dimensional model that preserves all history is hard
- Business Model
- implementation and perf agnostic
- one step down from data dictionary
- Ken - excited, loading model is different
- James - entities, attributes and relationships understood by business users
- biz user willing to dive in, but understandable
- no regard for implementation, tech, or processing req to load it
- Element - single piece of info
- entity - set of elements
- business entity is recognizable to business users and tends to occur in conversations - policy, claim, vehicle, and home are typical examples
- business model has to be careful about not getting carried away with abstract entities
- relationship - connects entity names: "user has a", "policy is a"
- model must not repeat the attributes of an entity in more than one place
- model must make it difficult to over-count the elements and values
- prevent excessive de-normalization
- if you take premium, and repeat premium for every claim, could over-sum
- reasonable level of denormalization
- business model should make it difficult
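A minimal sketch of the over-sum problem described above, using made-up policy and claim rows. All field names and amounts are hypothetical and are not taken from any openIDL data model.

```python
# Hypothetical data: one policy with a single premium and three claims.
policies = [{"policy_id": "P1", "premium": 1000.0}]
claims = [
    {"claim_id": "C1", "policy_id": "P1", "paid": 200.0},
    {"claim_id": "C2", "policy_id": "P1", "paid": 50.0},
    {"claim_id": "C3", "policy_id": "P1", "paid": 75.0},
]

# Denormalized view: the policy premium is copied onto every claim row.
premium_by_policy = {p["policy_id"]: p["premium"] for p in policies}
flat_rows = [{**c, "premium": premium_by_policy[c["policy_id"]]} for c in claims]

# Summing the flat rows counts the same premium once per claim.
overcounted_premium = sum(row["premium"] for row in flat_rows)

# Summing at the policy grain gives the figure that was actually written.
true_premium = sum(p["premium"] for p in policies)

print(overcounted_premium, true_premium)  # 3000.0 1000.0
```

Keeping premium only at the policy grain is what makes the over-count hard to produce by accident.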
- sparsity is the ratio of empty values to total number of values
- put homes and autos on the same row
- totally different attributes for each
- silly key; a bunch of elements filled out for one, a different bunch filled out for the other, and a bunch of blanks
- humans love flat and wide
- putting unrelated things together creates a sparsity problem
- not defining 8 normal forms right now
- will define in 3 min
- no entity when populated will have sparsity greater than 10%
- if occurs, entity reviewed
- entity that is highly covariant with another entity may be embedded in it
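The sparsity rule above could be checked mechanically. The sketch below assumes an entity is represented as a list of dict rows and that `None` or an empty string counts as an empty value; the function names, column names, and sample rows are illustrative only, and the 10% threshold is the placeholder figure from the discussion.

```python
from typing import Any, Mapping, Sequence

REVIEW_THRESHOLD = 0.10  # placeholder figure from the notes, open to consensus


def sparsity(rows: Sequence[Mapping[str, Any]], columns: Sequence[str]) -> float:
    """Ratio of empty values to total values across the given columns."""
    total = len(rows) * len(columns)
    if total == 0:
        return 0.0
    empty = sum(1 for row in rows for col in columns if row.get(col) in (None, ""))
    return empty / total


def needs_review(rows, columns, threshold: float = REVIEW_THRESHOLD) -> bool:
    """True when the populated entity exceeds the sparsity threshold."""
    return sparsity(rows, columns) > threshold


# Homes and autos forced onto the same row: each row leaves the other's fields blank.
mixed_columns = ["id", "vin", "make", "roof_type", "year_built"]
mixed_rows = [
    {"id": 1, "vin": "VIN-A", "make": "Ford", "roof_type": None, "year_built": None},
    {"id": 2, "vin": None, "make": None, "roof_type": "shingle", "year_built": 1990},
]

# Autos on their own: no blanks.
auto_columns = ["id", "vin", "make"]
auto_rows = [{"id": 1, "vin": "VIN-A", "make": "Ford"}]

print(sparsity(mixed_rows, mixed_columns), needs_review(mixed_rows, mixed_columns))
print(sparsity(auto_rows, auto_columns), needs_review(auto_rows, auto_columns))
```

In this sketch the combined home/auto entity is 40% empty and would be flagged for review, while the auto-only entity is not.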
- Peter
- the 10% in #11 - how did you come up with it?
- James - throwing darts; the number is whatever we reach consensus on
- "should be reviewed" - conversation trigger
- rationalization for normalizing, not just theory
- can make address its own domain
- most business people - it's policy address (or garaging address)
- saying - when an entity is highly covariant with another, separating it doesn't gain much
- denormalization argument
- #13 - laundry list, worth enumerating entity dictionary
- seed list
- Policy, Vehicle, Driver, Coverage, Claim, Claim Event, MORE
- list of what they are and what they need
- grouping data dictionary elements
- intro entities - intro relationships
- policy has one or more vehicles
- vehicle has one or more drivers
- policy has zero or many claims
- many claim events
- MORE
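One way to picture the seed list and its relationships is as plain data classes. This is only a sketch: the attribute names are invented, Coverage is left out because no relationship was stated for it, and attaching claim events to a claim (rather than directly to a policy) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Driver:
    driver_id: str


@dataclass
class Vehicle:
    vehicle_id: str
    drivers: List[Driver] = field(default_factory=list)  # one or more


@dataclass
class ClaimEvent:
    event_id: str


@dataclass
class Claim:
    claim_id: str
    events: List[ClaimEvent] = field(default_factory=list)  # many (assumed per claim)


@dataclass
class Policy:
    policy_id: str
    vehicles: List[Vehicle] = field(default_factory=list)  # one or more
    claims: List[Claim] = field(default_factory=list)  # zero or many


def cardinality_problems(policy: Policy) -> List[str]:
    """List violations of the 'one or more' relationships from the seed list."""
    problems = []
    if not policy.vehicles:
        problems.append(f"policy {policy.policy_id} has no vehicles")
    for vehicle in policy.vehicles:
        if not vehicle.drivers:
            problems.append(f"vehicle {vehicle.vehicle_id} has no drivers")
    return problems


policy = Policy("P1", vehicles=[Vehicle("V1", drivers=[Driver("D1")]), Vehicle("V2")])
print(cardinality_problems(policy))  # ['vehicle V2 has no drivers']
```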
- Kudos from all
- Ken - great guidepost for coming up with a good model but not necessarily requirements
- needs to... <examples>
- Dale
- is that the requirements for HDS
- see this as a lower level than HDS
- how the HDS would be created but not necessarily the requirements
- will be helpful in guiding development
- still need bullet list requirements
- David
- this is great - a good way to proceed once we know what we want to build; loves most of it
- should be sanity checks and guideposts
- most
- to answer any one question - what is the HDS supposed to do
- deciding level of sparsity or normalization vs... is tough if I don't have a stated biz req for what HDS is meant to do
- might not need to, business requirements are more lax
- not sure what HDS is supposed to do
- Peter
- must support state DOI such as auto coverage and auto territory
- Truman - can't do "all DOI"; must define each one
- Ken - short cut in the meeting
- Truman - can't skip that - critical to answering each question
- any two data calls
- annual, last year, year/half over course of year or a point in time
- can't go backwards
- Joan - is it possible that there are types of questions - not answering every one, but having them fall into simple classes
- annual data call, one set of data, 100s of questions answered from that set of data, very static
- other type - what's the state today - only a few classes of questions or sets that are queried
- simplify
- Truman - agree, not each state report, a couple of different ones: returning data over a historical period that exists, or a snapshot
- David - decouple it
- really important
- HDS should not be referencing specific use cases
- should be ref by the DMWG
- define DM that fits known use cases and predict future use cases
- implement and expose data elements
- any other requirements - easy and extensible
- don't define specific things
- requirement - HDS must implement data models developed by DMWG
- Dale - data must conform to DMWG data model standards
- David - it shall store and expose, make it accessible, make it real
- gist of it
- Dale - data should be current through 45 days
- David - Dale's is closer
- then we need to go one layer deeper
- these are decision points
- middle layer missing - perf of HDS
- bridging gap of what Dale wrote and this
- Peter - thinks DMWG will need more - very far from technical
- David - closest to business requirements
- user stories, non functional requirements, business requirements
- thing that must be done
- data that falls out of that - what we need to implement in HDS
- one layer deeper and above
- what are the requirements
- we don't need to know every change, just know and turned into a req
- req. - data is current as of X Date
- shall be exposed by HDS
- discuss them
- or could say "performance - what is acceptable - do we need instantaneous reads, what level of adhoc query do we need"
- to figure out
- then decide how to build DM based on that
- Satish - traceability of the data
- trace thru HDS and back to source system
- implementation - could be keys
- others
- how easy it would be for the data calls to discover the data in HDS
- do we have a req where data call or extraction pattern, need to know what is in HDS
- what attributes do we have
- what is the level of relationship of those attributes
- Ken - model we all know
- Satish - discoverable in realtime?
- do we have discoverable data requirements? conforming to a model
- David - a business requirement - what the user experience is for the data callers
- what the expectations are
- start from the user experience and user stories and work our way down
- how will it be used and how will carrier load into it
- two user stories not terribly well defined today
- what will that look like, how structure
- not defined
- Satish - must provide data current to 45 days prior - hinting at historical data? do we want to explicitly define the history we maintain?
- Dale - 5 years + current year
- Peter - want to say CGL and NJ have weird reqs - closer to 7
- David - HDS will store data as mandated by regulatory statutes
- Truman - does it matter if apply to each co differently?
- David - not every req needs to be enforced by openIDL; it's on the carrier to understand that
- requirement - HDS must be current and have hold back period of x years, then the carrier can decide how they want to implement
- carrier can decide how to implement, req "it must be there"
- Ken - must meet retention requirements
- Dale - prior + 45 days
- processing, getting through it, making sure it is right; don't get feeds every day, some are monthly
- David - hold off period to make sure it is right
- what we need + req
- Peter - trailing 24 months right now, 45 days will make all real happy
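The currency and retention figures floated above (current through 45 days prior, "5 years + current year") translate into a simple date window. The sketch below assumes retention is counted in whole calendar years back from the current year; the function and constant names are made up, and neither figure was settled in the meeting.

```python
from datetime import date, timedelta

HOLDBACK_DAYS = 45   # "current through 45 days", not yet agreed
RETENTION_YEARS = 5  # "5 years + current year", some lines/states may need closer to 7


def required_window(
    today: date,
    holdback_days: int = HOLDBACK_DAYS,
    retention_years: int = RETENTION_YEARS,
) -> tuple:
    """Return (earliest, latest) dates the HDS would be expected to cover."""
    latest = today - timedelta(days=holdback_days)
    # Assumption: "N years + current year" means back to Jan 1 of (year - N).
    earliest = date(today.year - retention_years, 1, 1)
    return earliest, latest


earliest, latest = required_window(date(2022, 5, 31))
print(earliest, latest)  # 2017-01-01 2022-04-16
```

Under David's framing the requirement would only state that this window "must be there"; how each carrier fills it stays an implementation choice.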
- Dale - edit package: an openIDL-sponsored and maintained edit package the info goes through prior to landing in HDS
- must maintain min of 5% tolerance at state line level
- similar to SDMA
- Peter - can do based on SDMA
- Joan - part of NAIC req
- talk about what data needs to be avail for RR - 5% tolerance as well
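A rough sketch of a tolerance check at the state/line level. It reads the "5% tolerance" as a maximum share of records failing edits per (state, line) pair, which is one interpretation and not a confirmed definition from SDMA or NAIC; the record layout and function names are hypothetical.

```python
from collections import defaultdict

TOLERANCE = 0.05  # assumed to mean: at most 5% of records may fail edits


def error_rates(records):
    """Share of records failing edits for each (state, line) pair."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for rec in records:
        key = (rec["state"], rec["line"])
        totals[key] += 1
        if not rec["passed_edits"]:
            failures[key] += 1
    return {key: failures[key] / totals[key] for key in totals}


def out_of_tolerance(records, tolerance: float = TOLERANCE):
    """Sorted (state, line) pairs whose error rate exceeds the tolerance."""
    return sorted(k for k, rate in error_rates(records).items() if rate > tolerance)


# Hypothetical load: VA auto has 1 failure in 10, ND auto has 1 failure in 100.
records = (
    [{"state": "VA", "line": "auto", "passed_edits": i != 0} for i in range(10)]
    + [{"state": "ND", "line": "auto", "passed_edits": i != 0} for i in range(100)]
)
print(out_of_tolerance(records))  # [('VA', 'auto')]
```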
- David - edit "before being in HDS" - when we load HDS, data could conform to 3 packages and fail 7 based on data call; could conform perfectly, but there could be a new data call that it doesn't conform to
- Ken - example - data call asks for VINs; I don't provide VINs, which would fail that call
- David - run against data in the HDS
- Peter - won't have NAIC data for the current reconciliation; won't have it until long after the 45 day period
- Ken - are we conflating 2 reqs - put data in with quality, but it might not meet reqs of a data call
- 2 diff reqs
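The two requirements can be kept apart in code: one check at load time, and a separate per-data-call check run against what is already in the HDS. The sketch below uses Ken's VIN example; the field names and the minimum load fields are assumptions made for illustration.

```python
LOAD_REQUIRED = ("policy_id", "premium")  # hypothetical minimum fields for loading


def is_empty(value) -> bool:
    return value in (None, "")


def passes_load_edits(record) -> bool:
    """Requirement 1: the record is good enough to land in the HDS."""
    return all(not is_empty(record.get(f)) for f in LOAD_REQUIRED)


def unmet_fields(records, call_fields):
    """Requirement 2: fields a specific data call needs that some record lacks."""
    return sorted(
        f for f in call_fields if any(is_empty(r.get(f)) for r in records)
    )


# Data that loads cleanly but carries no VINs.
hds = [
    {"policy_id": "P1", "premium": 1000.0},
    {"policy_id": "P2", "premium": 850.0, "vin": None},
]

print(all(passes_load_edits(r) for r in hds))  # True
print(unmet_fields(hds, ["premium"]))          # []
print(unmet_fields(hds, ["premium", "vin"]))   # ['vin']
```

The same loaded data satisfies one call and fails another, which is why the data call check has to run against the HDS rather than at load time.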
- Satish - is there any req on backwards compatibility
- DM will change over time
- are we mandating HDS is backwards compatible with previous vers of the DM
- Peter - in the camp of extraction patterns written against current iterations of model
- extraction model should tie to one model
- David - DMWG will probably update the data model; when it does, new elements might not be in all my old data
- might not have existed on a policy 7 years ago
- some issues there - needs to be defined
- Ken - data model needs to change over time
- Satish - how does HDS need to behave
- Dale - even though data conforms to the data model, not every attribute in the data model needs to be provided
- minimum required elements provided for everyone, all others at the option of the carrier
- Ken - ND example of being very sparse and still participating
- Dale - if doing stat reporting, minimal amount of data required
- David - almost think those reqs can be handled by the queries, HDS for simplicity only has to adhere to DM going forward
- if DM changes, anything in the HDS can optionally be updated
- Dale - even the fields in the DM today, not all are required - just Day 1s
- David - query tells us if it is needed for a specific use case
- simple req on HDS - required fields only mandated to be updated when a new DM is released
- could always go backwards
- example - middle of 2023, DM updated, we might just re-run all 2023 against the new DM, but unlikely go back to 2019
- query checked - what data is there
- HDS shall only conform to DM going forward not backwards compatible
- Ken - how does new data relate to old data
- worries about migration of the model
- informs how we build the model
- ways to do it
- extensibility - leads to a less normalized, more extensible kind of format
- sparsity, but the structure of the db would not have to change with new data elements
- 2 very diff architectures
- Jeff - more of a graph approach with facts vs schema/tables
- David - more JSON like
- document-like
- easy to start loading new data in year 5
- query old data from year one
- ignores if not there
- what's there passes the check
- less performant
- with a performance requirement - would let us know if it is an acceptable solution
- Satish - carrier - change to the DM, v2 - carrier 2 is still on v1, doesn't / won't migrate - how does the extraction pattern work?
- David - 2 EPs or 1 that checks for both versions, decouples DB from use case better
- if I am VA, running same data call for five years, only some carriers have new data, write EP that hits both
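A sketch of a single extraction pattern that reads both model versions over document-like records. It assumes a hypothetical `usage_based` flag that exists only from v2 of the data model onward; the record layout, field names, and version numbers are illustrative and not part of any agreed model.

```python
def extraction_pattern(records):
    """Total premium, with a usage-based breakout where the v2 field exists."""
    result = {
        "total_premium": 0.0,
        "usage_based_premium": 0.0,
        "records_without_flag": 0,
    }
    for rec in records:
        result["total_premium"] += rec["premium"]
        if "usage_based" not in rec:
            # v1 record: the field did not exist yet, so it is skipped, not failed.
            result["records_without_flag"] += 1
        elif rec["usage_based"]:
            result["usage_based_premium"] += rec["premium"]
    return result


# One carrier still on v1, another already loading v2 documents.
records = [
    {"dm_version": 1, "premium": 500.0},
    {"dm_version": 2, "premium": 300.0, "usage_based": True},
    {"dm_version": 2, "premium": 200.0, "usage_based": False},
]

print(extraction_pattern(records))
```

Reporting how many records lacked the field lets the requester see how much of the answer rests on v2 data without forcing carriers on v1 to migrate.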
- Ken - SLA approach - everyone shall conform by X time
- more consent, fewer dark corners
- David - carrier concern - whole value prop, set data, load HDS, takes churn off of us
- updating DMs every year, req fields added every year, looks less attractive, SLA requirement
- very hard, erodes value prop
- Ken - aspire to migrations for those things, easier to run
- some of these will be trivial
- most part - once solved model
- adding fields not reorganizing
- David - governance and guidelines "not requesting carriers add more data every couple months"
- know it will be iterative
- needs to be sensible, but can't be constantly asking carriers to add more mandatory data
- Dale - there will be a governance process inside each carrier
- new item doesn't happen overnight
- David - unfunded mandate
- can't commit to a mandated field from openIDL; there are internal gov processes for data that David cannot guarantee success on
- mandate - these extra fields are mandatory; can't go back to a line of business and say "put this in, it's mandatory" - they will say no
- even if you make it optional, too much drift/variation of data in there, it will be a mess
- major concerns here
- Jeff - standard model doesn't change too much, when they do it is thought out and structured
- high level reqs, not HDS but inform "how we make it extensible"
- Peter - if we add a new element shouldn't be backfilling for new model
- new attributes
- Jeff - policy - should you do it or not
- not just add a field and have a diff iteration of the model every month
- Peter - auto guy wants "yes/no" for usage based insurance
- Jeff - anticipation of structure vs availability of the data
- update history - - policy question
- more work
- concept of extraction request - federated and exists within diff carriers
- not a simple topic
- David - goes to why concerned about moving forward, if this is mechanism by doing data calls and asking for elements outside of standard - needs to be mechanism
- right now don't need to worry about data call not matching stat reporting requirements
- needs to be easy and extensible to respond to data calls
- unless we expect regs to only be choosing data from regulated list of data models
- Dale - info not in HDS may fall to the carrier to do directly rather than through openIDL
- never going to have all data calls going thru openIDL
- David - if it was really seamless, regulators look and say "I need this one element" and we could easily say "we can add that", it would make our lives really easy
- used for adhoc data calls
- if it is hard, can go around openIDL, less attractive
- hard to predict
- consider - might be beneficial for long term, 99% of data for data call, easy for carrier to add an element would make it easy for carriers
- Dale - picked up a lot of things for regulator in Day 2 and Day 3
- David - covid 19 - no one saw that exact data call coming
- always bespoke?
- Dale - might have been able to do it thru statistical data
- might have needed to get to coverage codes
- Peter - influenza of 1920 - 100 year disease
- David - req - we aren't going to cover every possible data call, trying to get the vast majority
- Peter - homework - getting regulators back together Friday
- short call
- altogether in the room on Friday
- talk about queries, sample questions
- agenda?
- Dale - working on what he views as business reqs
- 3 buckets
- data, data integrity, access and security and comms
- vetting with his folks at Travelers