
Design in Public - Information Vault (Post 2)

 


This is part 2 of the Design in Public series. In the prior post I drilled down into the requirements that are interesting to me:


  • Store Private Data for the Owner

  • Allow the Private Data Owner to delete Private Data from the System, disabling any further use.

  • Allow only internal Principals to retrieve it for a particular flow

  • Any internal Principal that requests Private Data which is no longer available should be able to learn that it was deleted, who the Owner was, and when it was deleted


Private data usually comes in more than one type, for example addresses, emails, phone numbers, full names, Social Security Numbers, card numbers (Primary Account Number) and expiry dates. The interface could be modeled as one interaction per data element, or it could accept multiple data elements in a single interaction. Letting the client decide is the better option in this scenario: an interface designed for multiple data elements per call still allows sending only one, so I’ll continue with that assumption.


  • Store multiple Private Data elements for the Owner

  • Classification of data as provided by Owner/Caller from a defined list: Full Name, Telephone, Email


At this point I’ll leave the following out of scope:


  • Deduplication of Data Elements within a single request. If the same data element is provided multiple times, each occurrence will be treated independently on the backend.

  • Validation of data types and data classification. If a value is sent, no attempt will be made to match it against the expected data format. This would not be acceptable in a production-ready system, but this exercise is meant to be as simple as possible.
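The scope so far can be sketched as a request shape. This is only an illustration: the names `Classification`, `DataElement`, `StoreRequest` and `parse_store_request`, the `owner_id` field, and the decision to reject classifications outside the defined list are all assumptions of this sketch. Values are kept exactly as provided and duplicates are kept as separate elements, matching the out-of-scope items above.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(Enum):
    """The defined list of classifications named in the post."""
    FULL_NAME = "Full Name"
    TELEPHONE = "Telephone"
    EMAIL = "Email"


@dataclass(frozen=True)
class DataElement:
    classification: Classification
    value: str  # stored as provided; no format validation (out of scope)


@dataclass(frozen=True)
class StoreRequest:
    owner_id: str  # hypothetical owner identifier
    elements: tuple  # one or more DataElement items, duplicates allowed


def parse_store_request(owner_id: str, raw_elements: list) -> StoreRequest:
    """Build a request from a list of {"classification", "value"} dicts.

    Classification(...) raises ValueError for labels outside the defined list.
    """
    elements = tuple(
        DataElement(Classification(item["classification"]), item["value"])
        for item in raw_elements
    )
    if not elements:
        raise ValueError("at least one data element is required")
    return StoreRequest(owner_id, elements)
```

A single call can carry one element or many, so a client that prefers one interaction per element simply sends a list of length one.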


To delete private data the owner must provide the token. The token is the only public reference to the private data, so it acts as an identifier, which means that each token must be unique. This is where the topic becomes interesting: tokens must have sufficient entropy to avoid collisions, and depending on the structure of the tokens this might be difficult to implement, so the range of possible output values must be considered.
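To get a feel for how the range of output values relates to collisions, here is a rough estimate using the standard birthday-bound approximation. It assumes tokens are drawn uniformly at random, which the post does not prescribe, and the function name `collision_probability` is made up for this sketch.

```python
import math


def collision_probability(token_space: int, issued: int) -> float:
    """Approximate chance that at least two of `issued` random tokens collide.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2N)),
    where N is the number of possible tokens and n the number issued.
    """
    if issued > token_space:
        return 1.0  # more tokens than values: a collision is guaranteed
    exponent = issued * (issued - 1) / (2 * token_space)
    return -math.expm1(-exponent)  # equals 1 - exp(-exponent), stable for tiny values


# A 10-digit numeric token space versus a 128-bit random space.
small_space = collision_probability(10**10, 100_000)
large_space = collision_probability(2**128, 10**9)
```

With these inputs the 10-digit space already has a substantial chance of a collision after a hundred thousand tokens, while the 128-bit space stays negligible even after a billion, which is why the structure of the token matters so much.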


For example, names are the simplest to handle: a string of more than 12 characters with spaces could be used as the token. Emails work the same way, with an ‘@’ character in the middle and some ‘.’ characters. GUIDs or ULIDs could even be used if their downsides are acceptable.
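A minimal sketch of what such shaped tokens could look like, assuming random lowercase letters are an acceptable alphabet. The function names, the word lengths and the `vault.invalid` domain are illustrative choices, not part of the design.

```python
import secrets
import string
import uuid

_LETTERS = string.ascii_lowercase


def _random_word(length: int) -> str:
    """Random lowercase word from a cryptographically secure source."""
    return "".join(secrets.choice(_LETTERS) for _ in range(length))


def name_like_token() -> str:
    """Two capitalised random words with a space: 15 characters in total."""
    return f"{_random_word(6).capitalize()} {_random_word(8).capitalize()}"


def email_like_token(domain: str = "vault.invalid") -> str:
    """Random local part, an '@' in the middle and a '.' in the domain."""
    return f"{_random_word(12)}@{domain}"


def guid_token() -> str:
    """Random (version 4) GUID, for when a name or email shape is not needed."""
    return str(uuid.uuid4())
```

The name-like token carries 14 random letters, roughly 65 bits of entropy, so the shape constraint costs entropy compared with a GUID but keeps the token valid wherever a name is expected.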


Telephone numbers might be tricky, as only numeric characters are expected apart from the area code at the start of the value. Because duplicates are not detected, each duplicated value consumes another token from the available pool, so checking for exhaustion of the token pool is important in this case. Reuse of deleted tokens could be considered later, but it has implications for consumer systems.
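The exhaustion concern can be sketched with a small in-memory pool. Everything here is hypothetical: the class and exception names, the `+00` prefix and the number of digits are placeholders, and a real system would track issued tokens in durable storage.

```python
import secrets


class PoolExhausted(Exception):
    """Raised when every value in the numeric token range has been issued."""


class TelephoneTokenPool:
    """Issues unique numeric tokens of a fixed length behind a fixed prefix."""

    def __init__(self, prefix: str = "+00", digits: int = 7) -> None:
        self.prefix = prefix
        self.digits = digits
        self.capacity = 10 ** digits  # number of distinct tokens available
        self._issued = set()

    @property
    def remaining(self) -> int:
        return self.capacity - len(self._issued)

    def issue(self) -> str:
        """Return an unused token, or raise PoolExhausted if none is left."""
        if self.remaining == 0:
            raise PoolExhausted(f"all {self.capacity} tokens are in use")
        # At least one value is free, so this loop ends; it slows down as the
        # pool fills because most random picks are already taken.
        while True:
            number = secrets.randbelow(self.capacity)
            token = f"{self.prefix}{number:0{self.digits}d}"
            if token not in self._issued:
                self._issued.add(token)
                return token
```

Deleted tokens are never returned to the pool in this sketch, which matches the idea of postponing token reuse until its effect on consumer systems is understood.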


To exchange a Token for the original Private Data, the client must present the Token and the system must validate that the client is authorized to exchange it. This is where delimiting the security boundaries of the system is paramount.

To keep things simple, let’s assume that the Owner always uses the same public interface while Operators and other systems use a different one. This topology allows different user pools or principal sources for each interface, separating the actions completely. Each token must then be correctly matched with an Owner to allow deletion of Private Data. This allows partitioning at the Owner level, which also makes it pragmatic to force clients of the Retrieval endpoint to provide an Owner identifier, which might already be in use in other parts of the system. The other benefit of this approach is that the token pool is effectively expanded, since tokens only need to be unique within an Owner’s partition. Let’s look at the data access patterns.


  • Store Private data for Owner

  • Delete Private Data for Owner by Token

  • Retrieve Private Data by Owner and Token
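The three access patterns can be sketched against an in-memory store partitioned by Owner. This is a hedged illustration rather than the design itself: `InMemoryVault`, `Tombstone`, `PrivateDataDeleted` and the `owner_id` parameter are invented names, authorization of the caller is left out, and keeping a tombstone on deletion is one possible way to meet the requirement that internal Principals can learn who the Owner was and when the data was deleted.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Tombstone:
    """What remains after deletion: who owned the data and when it was deleted."""
    owner_id: str
    deleted_at: datetime


class PrivateDataDeleted(Exception):
    """Raised when a token is exchanged after its data was deleted."""

    def __init__(self, tombstone: Tombstone) -> None:
        super().__init__(
            f"deleted by owner {tombstone.owner_id} "
            f"at {tombstone.deleted_at.isoformat()}"
        )
        self.tombstone = tombstone


class InMemoryVault:
    """Hypothetical in-memory store, partitioned by owner identifier."""

    def __init__(self) -> None:
        self._live = {}     # owner_id -> {token: value}
        self._deleted = {}  # owner_id -> {token: Tombstone}

    def store(self, owner_id: str, values: list) -> list:
        """Store each value independently (no deduplication); return its tokens."""
        live = self._live.setdefault(owner_id, {})
        deleted = self._deleted.setdefault(owner_id, {})
        tokens = []
        for value in values:
            token = secrets.token_urlsafe(16)
            # Uniqueness only has to hold inside this owner's partition.
            while token in live or token in deleted:
                token = secrets.token_urlsafe(16)
            live[token] = value
            tokens.append(token)
        return tokens

    def delete(self, owner_id: str, token: str) -> Tombstone:
        """Owner-facing: drop the value and keep only a tombstone."""
        live = self._live.get(owner_id, {})
        if token not in live:
            raise KeyError("unknown token for this owner")
        del live[token]
        tombstone = Tombstone(owner_id, datetime.now(timezone.utc))
        self._deleted.setdefault(owner_id, {})[token] = tombstone
        return tombstone

    def retrieve(self, owner_id: str, token: str) -> str:
        """Internal-facing: exchange owner identifier plus token for the value."""
        tombstone = self._deleted.get(owner_id, {}).get(token)
        if tombstone is not None:
            raise PrivateDataDeleted(tombstone)
        try:
            return self._live[owner_id][token]
        except KeyError:
            raise KeyError("unknown token for this owner") from None
```

Because every lookup goes through the owner partition first, a token presented with the wrong Owner identifier behaves exactly like an unknown token.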
