Discussion notes:

AWG Master diagram:

https://lucid.app/lucidchart/ac20d4e1-50ad-4367-b5cf-247ed9bad667/edit?viewport_loc=-301%2C-37%2C3200%2C1833%2CkgLSxXlpoGmM&invitationId=inv_de1a5e61-8edc-488a-90ee-8312f8c69cd4#

  • KS -
    • Work from this and other diagrams
    • Draw boxes as opposed to blank slate
  • PA - start broad, draw stuff until we have enough
  • KS - can check off major functions, components in swim lanes, need to ingest data, load it, data calls, extract, report - high level - can start from left and spitball flow of data as it comes through the system, get high-level boxes in there, or as we go itemize what's in boxes - if doing it alone Ken would do all the above - start from something that may or may not be close
  • KS - first step: getting data into HDS, through some means; many multiple DBs or systems are the source of this info, needs to be normalized - ETL is going to normalize data from different sources into an openIDL format (HDS - first thing to happen) - another thing that is going to happen: we are going to edit it (syntax, data errors in ETL), then convert to HDS format and load into HDS (all happens in Member Enterprise on LEFT of diagram)
  • JM - first point of concern, to do the edits, need to be in standardized format (for predictability)
  • KS - if standardizing edits, the data input must be standardized at this point, putting data into standardized format - edit package will edit it for validity against a number of rules, know we have a rules engine in the ETL (and rules repository lies in ETL)
  • JB - puts data into sep format and second step where it checks it
  • KS - ETL standardizes data, then ETL Edit engine (Rules Engine, Rules Repo) - 2 major steps going on inside here: ETL and Edit then converter to map it to format for HDS
  • KS - standardized, edited (rules engine and repo), then mapped to HDS format - coming out of here: only the valid records
  • JB - warnings and exceptions based on issues
  • JM also intermediary data set, want to land data into some structure visible to run edit rules, some type of repo, biased towards persistence
  • JB - batch or message format
  • JM - stuff on fly needs troubleshooting, persisting diff issue (maintenance)
  • JB - just some kind of file format with w/ schema, well-defined structure
  • KS - after edit, thought we were keeping all the records in the HDS, still true? Flagging those with errors? Or just dropping them?
  • JB - 2 types: data errors - sanity check?
  • KS - UI for controlling edits and release of this stuff
  • JB - going to be issues with data but if thresholds not above a certain threshold... path, data quality checks - discussed everything submitted goes to HDS b/c it comes from the source
  • KS - SDMA function: "this has x errors but in a few months will release all data"
  • JB - submissions - daily or weekly
  • JM - then don't need UI - up to carrier to figure out how to pass all tests
  • KS - think we heard they'd rather have a standardized process, need UI to get it to work - no consensus, thinks Dale
  • JB could be process that reads files and provides outputs
  • KS - 10k+ rules that are needed, people want to leverage them
  • PA - as someone who used SDMA a lot, lot to be said for how that has allowed users to self-service
  • JB - supply somehing for all, add own rules, but also a standard set of rules all can use
  • JM - depends on what you want in interface, put UI that allows to see whats going on is fine vs UI for editing data
  • PA - doing it today, have both options, can update rows but a lot of times errors are when carrier ETL fails to make it correctly
  • KS - not happening quick enough for cycle to finish - have to get it out in the week, can't get it fixed at the source but need to be able to hit endpoint - then DB got complicated, error log records, etc.
  • JB - sep files from corrected records, if there are tests for sanity check, level of quality, up to carrier to try to resolve?
  • KS - heard it doesn't work, what we want to do but can't always do that - sometimes timeframe needs fix in SDMA, either data or process won't change in time
  • JB - spot in the middle, can't change box 1 in pic, you can change downstream, doesn't HAVE to mean HDS, basically a box 2 instead of UI for fixing it
  • KS - do it after standardization, not on carrier, up to openIDL footprint to make changes, IS IT PART of openIDL footprint to provide fixing for this data - how do we decide
  • PA - flip to PATL page, kind of boiled-down version, SDMA today, what we can see being robust way of doing it - Carrier ingestion portal, do large edits before or standardized edit after, through package run against any data set; from ingestion portal triggers a job to HDS - as soon as we put in working table, assign UUID
  • KS - not clear where error editing is happening
  • JM - how do you make changes? edits?
  • KS - have UI to make changes, shows tables, almost like excel editing
  • JB - who does that at carrier?
  • PA - Susan or Reggie for example at TRV - the business people in charge of loading data; something like this is a feature a lot of companies would want to run, like TRV bypass whole system, smaller companies would want it
  • KS - TRV currently fixing data with SDMA - every now and then we can't get the back end to feed the right data
  • PA - thinks the way we are right now, making all work with this workflow, TRV isn't making changes to data lake, submitting excel with why numbers should be adjusted (adjustment artifact) - as of today AAIS is not updating records in data lake, allowing edit at load, working in ingestion portal
  • KS - need to decide if that's a requirement for the system or not, can say "if we have it, where would it be?"
  • PA - whole secondary thing in HDS, error table and correction table in HDS, not there today
  • JB - baseline data quality check, simply check data for acceptability (errors under threshold) and pass along if not exceeded OR if exceeded carrier would need to update, have means to edit records and process using tools from openIDL - baseline data quality check where data meets certain quality; if not accepted, would carrier allow it to be edited or is it back on carrier
  • JM - day 1, day 2, day 3 - simplest design, carrier loads HDS and then done - maybe set flag "ready to go" and if it fails test back up batch and put it back in, no staging area or UI needed to change
  • JB - loading into DB and back it out, squirrely, having format to check on way in (instead of loading garbage)
  • JM - babystep: v1 is flat load and back out (otherwise staging area to run rules against it), v2 fail and fix, v3 do whatever it takes (typically have a spot where in flow can modify data, OR formal facility in there - day 1, 2, 3 question) - mixing delivery with architecture
  • JB - likes v2, simplest way to start, door open; if we think a modification interface: quality checks vs modifications, can't let crappy data load, iterate across - how do fixes re-apply if you reload stage? fixes in robust design, put them in a fixit body of tables, re-apply fixits - fixit gets big fast
  • PA - non-robust version TODAY of what JM described, once you see issue, so we make the right call to reload or...
  • JM - load 1x, modify with fixit interface
  • JB - copy, fix it, resubmit - still have sep files
  • JM - prob with fixit, copy made must have structure, could be overwritten with reload
  • PA - haven't ironed out - 4 csv in one day for auto (1 month worth of loading) - are those 4 CSV one job? mult docs per job? 
  • JB - org'd as sep submissions or files
  • KS - some scope of identifier of this package of data, work on package identifier, don't release until edited/removed
  • PA - pivot slightly: passing 47 of 48 states, is there a facility to load the 47 that passed, or all-or-nothing?
  • JM - batch ID mechanism on all tables, put it into stage, it's a batch - if you chose to say "loaded 48 states as a batch and one is wrong", back them all out, or individually? (SEAN REVIEW TIME 45min)
  • JB - file in a folder means needs to be processed
  • KS - a huge company like Hartford wouldn't want to equate file to a batch (multiple batches to a file, multiple files to a batch)
  • PA - why would a batch have a state/line indication - whether pass or fail, 4 separate batches
  • JB rules by state?
  • PA - depends, rules are more national for most part but passing "is this data for this state/line/timeframe valid?"
  • KS - can batch be orthogonal to reporting? Sent bunch of data, want to fix it all if it doesn't work, only found one state screwed up - can't pass the state, batch is...
  • PA - if you have your batch transcend states and lines, tagging errors to non errors
  • JB - data quality checks record by record, if exceed %/proportion bad collection, may have stats on where errors came from, not sure to enforce pre-sorting of data
  • PA - if I have a set of data (NC and SC) and I submitted, should machine render 2 separate batches?
  • KS - can we do it without calling a new batch? return errors "these AZ records don't pass" and have choice to fix, split and fix, etc. - doesn't invalidate batch to have one part of it messed up
  • JB - idea: check record for format, at some point too many errors is a problem; if collections are from 2 different states then need better analytics on data quality, enhancement of data quality checks - make it as easy as possible for carriers to submit data
  • JM - agree, place to stage things, rules to be run - day 1 fix process is on carrier, bolting on fix UI won't break prior architecture - add data quality state column to design, edit package will respond "bad, good, warning" - only answering data quality, not changes
  • KS - does it require control mechanism to release data to allow...
  • JM - happy path - gets data in staging area, normal process kick off the rules, give answers and return answers to scheduler, if value is past safety threshold, if passes, done, if fails fires off email
  • KS - control DB? Job flow?
  • JM - if everything works it just works; if edit is fast run the load to HDS, otherwise intervention event - fire email to team/group notified there is a problem
  • JM - if edits return "pass" you let the job flow - have to go on assumption 99% of time batches run and work, worried about fixit approach - ideal is understand why things are happening, these batches should just flow
  • KS - pipeline approach
  • JM - fan of persistent stage, 3 major boxes of ETL integrating with staging, stat model we can all live with on day 1, believes we will check 10-12 EPs, day 1 should be stat model
  • JM - keep adapter as small as it is, explain it
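The ingestion flow discussed above (standardize, then edit via a rules engine/repository, then map to HDS format, loading only valid records) could be sketched roughly as below. This is a minimal illustration only; all names (`standardize`, `run_edits`, `map_to_hds`, the sample rules) are hypothetical and not part of any existing openIDL codebase:

```python
from dataclasses import dataclass, field

@dataclass
class EditResult:
    valid: list = field(default_factory=list)       # passed all rules
    exceptions: list = field(default_factory=list)  # failed at least one rule

def standardize(raw_record: dict) -> dict:
    """Step 1 (ETL): normalize a carrier-specific record into a standard input shape."""
    return {k.lower().strip(): v for k, v in raw_record.items()}

def run_edits(records, rules) -> EditResult:
    """Step 2 (edit engine): run every rule from the rules repository against each record."""
    result = EditResult()
    for rec in records:
        failures = [name for name, rule in rules.items() if not rule(rec)]
        if failures:
            result.exceptions.append((rec, failures))
        else:
            result.valid.append(rec)
    return result

def map_to_hds(record: dict) -> dict:
    """Step 3 (converter): map an edited record into the HDS format before loading."""
    return {"hds_" + k: v for k, v in record.items()}

# Hypothetical rules repository: rule name -> predicate over a record.
rules = {
    "has_state": lambda r: bool(r.get("state")),
    "positive_premium": lambda r: r.get("premium", 0) > 0,
}

raw = [{"State": "NC", "Premium": 100}, {"State": "", "Premium": 50}]
edited = run_edits([standardize(r) for r in raw], rules)
hds_load = [map_to_hds(r) for r in edited.valid]  # only valid records reach HDS
```

The point of the shape, per the discussion, is that standardization happens before edits (edits need a predictable format) and that exceptions come out as a separate stream rather than being silently loaded.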
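JB's threshold idea and JM's happy path (record-by-record quality checks; if the error rate is under a threshold the batch just flows to HDS, otherwise fire a notification for intervention) might look like this. The threshold value and all function names are made up for illustration:

```python
ERROR_THRESHOLD = 0.05  # hypothetical: max tolerable fraction of failing records

def quality_gate(records, check):
    """Record-by-record data quality check; batch passes if error rate is under threshold."""
    bad = [r for r in records if not check(r)]
    rate = len(bad) / len(records) if records else 0.0
    return rate <= ERROR_THRESHOLD, rate, bad

def process_batch(records, check, load_to_hds, notify):
    """Happy path: edits pass -> load straight to HDS; fail -> notification, no load."""
    ok, rate, _bad = quality_gate(records, check)
    if ok:
        load_to_hds(records)  # batch just flows, no intervention
    else:
        notify(f"batch failed quality gate: {rate:.1%} of records bad")
    return ok

# Usage with stand-in callbacks for the HDS loader and the email/alert hook:
loaded, alerts = [], []
records = [{"premium": 100}] * 99 + [{"premium": -1}]  # 1% of records are bad
ok = process_batch(records, lambda r: r["premium"] > 0,
                   loaded.extend, alerts.append)
```

This matches the "99% of batches just run and work" assumption: intervention (the notify path) is the exception, not a step in the normal flow.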
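JM's batch-ID mechanism and PA's 47-of-48-states question (release the states that passed, back out only the failing ones) could be sketched as follows; the staging functions here are hypothetical, not an agreed design:

```python
import uuid
from collections import defaultdict

def stage(records):
    """Tag every incoming record with one batch ID (batch-ID column on all staging tables)."""
    batch_id = str(uuid.uuid4())
    return [{**r, "batch_id": batch_id} for r in records]

def release_by_state(staged, state_passed):
    """Release records for states whose edits passed; back out only the failing states."""
    by_state = defaultdict(list)
    for rec in staged:
        by_state[rec["state"]].append(rec)
    released, backed_out = [], []
    for state, recs in by_state.items():
        (released if state_passed(state) else backed_out).extend(recs)
    return released, backed_out

# One submission covering three states, where only AZ fails its edits:
staged = stage([{"state": "NC"}, {"state": "SC"}, {"state": "AZ"}])
released, backed_out = release_by_state(staged, lambda s: s != "AZ")
```

Note this deliberately keeps KS's point that one bad state does not invalidate the whole batch: the batch ID stays shared, and only the failing partition is held back.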

Time | Item | Who | Notes