WO2011157708A1 - Methods and systems for securely handling datasets in computer systems - Google Patents

Methods and systems for securely handling datasets in computer systems

Info

Publication number
WO2011157708A1
Authority
WO
WIPO (PCT)
Prior art keywords
dataset
safe
partitions
document
owner
Prior art date
Application number
PCT/EP2011/059846
Other languages
French (fr)
Inventor
Christian Breitenstrom
Gerd SCHÜRMANN
Radu Popescu-Zeletin
Jens Klessmann
Andreas Penski
Original Assignee
Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Publication of WO2011157708A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0816 Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L9/085 Secret sharing or secret splitting, e.g. threshold schemes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/1824 Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G06F16/1827 Management specifically adapted to NAS
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/556 Detecting local intrusion or implementing counter-measures involving covert channels, i.e. data leakage between processes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/604 Tools and structures for managing or administering access control systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0894 Escrow, recovery or storing of secret information, e.g. secret key escrow or cryptographic key storage
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2117 User registration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2141 Access rights, e.g. capability lists, access control lists, access tables, access matrices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2149 Restricted operating environment

Definitions

  • cloud computer environment is understood to be an environment where at least one cloud computer system is provided in addition to classical storage systems like individual computers, external storage media like USB sticks or smartcards etc.
  • Cloud computing describes in general IT services based on a network, in particular the Internet, and it typically
  • the cloud customer can only rely on Service Level Agreements with the cloud provider based on certifications and regular security reviews by internal and trusted third parties. Even the cloud space providers face the problem of proving that every endeavor has been made to keep the customers' data confidential. So they spend lots of money without being sure or able to guarantee that their own administrators are not leaking confidential data.
  • One embodiment is a method for safely handling at least one dataset, in particular a document, in a cloud computer environment, wherein
  • the at least one dataset is partitioned into at least two dataset partitions
  • the at least two dataset partitions are stored on and / or retrieved from at least two computer sites being part of a cloud computer environment, so that on no computer site all dataset partitions of the at least one dataset are present, in particular not present at the same time.
  • Storing datasets, e.g. sensitive documents, distributed over a cloud computer environment enhances the security of the individual documents. Access to an individual partition of the dataset would not compromise the security of the complete documents since the partition itself does not provide
  • Another embodiment is a method for safely handling at least one dataset, in particular a document, in a cloud computer environment, wherein a retrieval of a partitioned dataset to combine the at least one dataset comprises
  • the dataset partitions are transformed into a logical representation of the at least one dataset.
  • a safe system stores the distribution-information of the datasets which are to be distributed to e.g. storage
  • a further embodiment is a method for safely handling at least one dataset, in particular a document, in a cloud computer environment, wherein a storing of a partitioned dataset derived from the at least one dataset comprises
  • FIG. 1 showing schematically a document with several document partitions and its distribution
  • FIG. 2 showing schematically a document partitions
  • FIG. 3 showing schematically a logical and physical
  • FIG. 4 showing schematically the interactions of a safe owner
  • Fig. 5 showing an embodiment of a read mechanism involving a safe owner, a safe provider and storage providers
  • Fig. 6 showing an actor model for an electronic safe system
  • Fig. 7 showing an electronic life cycle of a safe system
  • FIG. 8 showing the collection of data involving a safe system owner and a safe user
  • Fig. 9 showing a safe user infrastructure
  • Fig. 10 showing the process of a safe user sending a release order request to a safe system.
  • the method and the system are used to improve the security of datasets 10, 20, in particular documents (10, 20).
  • a document 10, 20 is a special instance of a computer file, comprising blocks of arbitrary information or a resource for storing information, being available for a computer program. Therefore, documents 10, 20 or files are to be considered separate from programs, which can be used to, e.g., create, alter or store documents or files.
  • a dataset 10, 20 is understood to be even more general than a file since it makes no inherent assumption about the internal structure.
  • a dataset is understood to be a set of binary data which can be distinguished from other datasets.
  • a logical representation of a document 10, 20 is a
  • the physically distributed representation of a document 10, 20 contains distribution-information (e.g. a distribution tag 11, 12, 13) that enables the person 1000 authorized to access the document 10, 20 to transform the document 10, 20 back to its logical representation.
  • distribution-information e.g. a distribution tag 11, 12, 13
  • the tag name 11, 12, 13, which can be the name of the file (or a part thereof) , allows the identification of the document partitions 1, 2, 3, 4.
  • the document 10, 20 can be encrypted and then be subjected to a partition into the partitions 1, 2, 3, 4.
  • the partitions can be stored at the storage providers under a number. The storing with the storage providers is performed together with the storing of a so-called domain pseudonym. This way the person 1000 authorized to access the document 10, 20 can read the documents 10, 20 without giving away the identity of the user, in particular the owner.
  • the physically distributed representation of the document 10, 20 is the physical partition of the document 10, 20 into document partitions 1, 2, 3, 4. Those document partitions 1, 2, 3, 4 can be
  • a storage provider is a service to store arbitrary datasets, in particular documents 10, 20. It might be a professional IT service provider, a cloud provider, personal computers and / or mobile storage devices (USB cards, mobile phones etc.). Embodiments of the methods and systems for administering documents 10, 20 in distributed form are illustrated in Fig. 1, 2 and 3.
  • a document 10 is shown as one logical
  • the document 10 is divided into three different document partitions 1, 2, 3; each of the document partitions 1, 2, 3 is associated with a distribution tag 11, 12, 13 as distribution-information.
  • the partitions 1, 2, 3 are stored in different computer sites 100, 200, 300 at storage providers.
  • the document partitions 1, 2, 3 are datasets comprising binary data.
  • the safe system 1010 is a personalized collection of services to store, retrieve and manage the physical distribution representations of documents 10, 20.
  • the document partitions 1, 2, 3 can be arbitrarily chosen, i.e. the document 10 can be arbitrarily divided.
  • the document partitions 1, 2, 3 can, but do not have to, have the same size.
  • the number of document partitions 1, 2, 3 is two or greater than two so that a distribution over different sites (e.g. computer servers) 100, 200, 300, 400 (see Fig. 2, 3) can be achieved.
  • One way to partition a document 10, 20 is the secret sharing algorithm of Rabin.
  • the method of Rabin produces inherently partly-redundant partitions in a
  • partitions 1, 2, 3 have at least one redundant part. If the document 10 was, e.g., partitioned by the algorithm of Rabin, two of the three partitions would be sufficient to recombine the document 10.
  • the partitioning is not bound to any specific algorithm; in particular, algorithms providing redundancy may be applied to increase the safety against (temporary or permanent)
  • the concrete number of partitions required depends on the partitioning algorithm applied. In the simplest case of non-redundant, non-overlapping partitioning, if one of the document partitions 1, 2, 3 is missing and / or corrupted, the document 10 cannot be recombined. So if an internal or external attacker gets access to a computer system with one of the document partitions 1, 2, 3 (or one of its copies), the content of the complete document 10 cannot be known to him. If a portion is sufficiently small and/or carries several, non-continuous parts of document 10, even if
  • partitions 1, 2, 3 is encrypted. This can mean that the original file 10, 20 has been encrypted, so that its
  • the document partitions 1, 2, 3 can be encrypted separately prior to or after the distribution of the document partitions 1, 2, 3 to different servers 100, 200, 300, 400. Before the mechanisms of the distribution are discussed, the advantage of the separation of responsibilities is described.
  • a first mechanism in order to achieve trustworthy cloud spaces is the separation of responsibilities. For most sensitive private or business documents 10, 20 the owner 1000 has to be in full control of the document 10, 20.
  • a storage provider is responsible for availability
  • a storage provider can be the user using his own computer, a
  • documents 10, 20 allows a better separation of concerns in the storage and retrieval of sensitive documents 10, 20 and a better mechanism against breaching attacks.
  • the separation of the logical representation of the document 10, 20 and its physically distributed representation of the document 10, 20 over different independent sites is described in Fig. 1, 2,
  • the owner 1000 controls the access to his documents 10, 20 via the safe system 1010.
  • the safe system 1010 logically comprises all documents 10, 20 and their related management information (i.e. information about the storage locations of the documents 10, 20).
  • the safe system 1010 might work with documents 10, 20 in different formats.
  • a safe owner 1000 owns two documents 10, 20.
  • Those documents 10, 20 can be e.g. a pdf-file and a doc-file.
  • the safe owner 1000 can own any number of documents 10, 20 which can all have the same file format or which can have different file formats as shown in Fig. 3.
  • the safe owner 1000, e.g., hires a safe system 1010 provided by the safe provider.
  • In other alternatives the safe owner 1000 can have more than one safe 1010 with more than one safe provider.
  • the safe provider is an entity or organization that provides safe systems and is liable for the safe.
  • the safe system 1010 provider and the safe owner 1000 can be the same entity.
  • the safe owner 1000 owns the contents of his safe 1010 or safes.
  • the physically distributed representation A shows four instances of data (indicated by the binary strings). These are document partitions 1, 2, 3, 4 of the documents 10, 20 which are distributed over several sites 100, 200, 300, 400.
  • partitions 1, 2, 3, 4 does not decrease the safety and security of the documents by distributing them over the net; the distribution rather improves the safety and security by not storing the document partitions 1, 2, 3, 4 in one place.
  • the physical data of the documents 10 is distributed as document partitions 1, 2, 3, 4 across different physical storage providers 100, 200, 300, 400, the client computer, internal and external storage devices so that every physical storage provider can only access the physical portion of the document 10, 20 he owns.
  • the storage sites 100, 200, 300, 400 change dynamically. This can be achieved by the safe system 1010, which can dynamically allocate new storage sites. This can be done according to a random (or pseudo-random) process. The dynamic shifting of the partitioned document makes an unauthorized retrieval of the document 10, 20 even more difficult.
  • the transformation and the recovery mechanism are realized at the client side and controlled by the owner 1000 of the documents 10, 20.
  • Each storage provider is responsible only for the physical part of the document 10, 20 (i.e. the document partitions 1, 2, 3) which was assigned to him. He is not responsible for confidentiality of the document 10, 20, as each single physical document partition does not reveal any information about the original document 10, 20 (logical view of the document).
  • any insider attack at any storage provider cannot be successful since only parts of the physical data (i.e., some individual document partitions 1, 2, 3, 4) can be addressed.
  • the safe mechanism provides the trustworthy storage
  • a safe client, e.g. a mobile device connectable to the safe system 1010, may encrypt and/or partition the document and store it at independent Storage Providers (see Fig. 2 and 3).
  • the safe client may distribute the physical document
  • This distribution and storage is performed at interconnected storage units such as mobile devices (USB, smartcards, mobile phones etc.), the owner's own computer, or cloud(s). Connectivity between these storage devices and the safe client is a prerequisite.
  • the confidentiality is achieved by physical distribution of the documents 10, 20, not by trust and regulation.
  • This is shown in Fig. 4.
  • an owner 1000 is interacting with a mobile device 1001.
  • the document partitions 1, 2, 3, 4 are stored in a distributed way on sites 100, 200, 300, 400.
  • a mobile device 1001 becomes a safe owner key, increasing the user perception and responsibility towards trust and security in the cloud storage system.
  • the owner 1000 of the mobile device 1001 e.g. mobile phone, mobile computer
  • the documents 10, 20 would not be stored on the mobile device 1001, reducing the risk that sensitive documents 10, 20 are lost with the mobile device.
  • the logical information is split at the client side. That means that the only location where the logical data is available in plain text format (or in another human-readable format) is the safe client. Different mechanisms can be provided to secure the safe client.
  • Fig. 5 shows an embodiment of a read mechanism involving a safe owner (safe client), safe provider (with the safe system 1010) and storage providers.
  • the read process is started with an authorization of the owner 1000 at his or her safe system 1010 at the safe
  • the distribution information about the document partitions 1, 2, 3, 4 is obtained, the information is
  • the authorization ensures that only the legitimate owner 1000 of the safe system 1010 may access it.
  • a trustworthy electronic safe system 1010 for data and documents is developed.
  • a client device e.g. a mobile device.
  • the distribution-information is kept on the client device, while the document partitions 1, 2, 3, 4 are distributed (i.e. stored) at various storage providers.
  • a safe system 1010 is introduced to keep the distribution-information (preferably encrypted) on a safe system provided by a safe provider, while the partitioning and / or recombination of the partition is still performed on the client device.
  • the safe system 1010 infrastructure requires multiple actors: the safe owner 1000, who manages her private data within the safe system 1010; the storage provider, which stores small data blocks of content; the safe provider, which stores some kind of directory information; and the safe user, who gains insight into the owner's safe system 1010. It should be noted that the person 1000 can be the safe owner, but this does not necessarily have to be the case.
  • An electronic safe system 1010 for datasets and in particular documents 10, 20 offers the safe owner 1000 confidentiality and availability of information stored in it.
  • the storage facilities are organized in a strongly decentralized manner (e.g. by the methods and systems described above), so the safe system 1010 is not a conventional database, registry or something similar.
  • the information stored in the safe system 1010 is assured not to be recoverable, even in the long term, without the consent of the safe owner 1000 or her delegate.
  • the Safe Owner 1000 is the natural person who collects her data within the Safe.
  • the Safe Provider is an IT solution provider that operates the Safe.
  • the Safe Owner 1000 can easily access all Safes with the same Safe Client software.
  • the Safe User is the party that might receive grants to access parts of the personal data of the Safe Owner.
  • a Safe User is able to send data to the Safe Owner, without needing explicit grants from the Safe Owner to do so. It is assumed that there is a constant "safe-address" that is known to the Safe User.
  • Storage Provider is an IT service provider that offers highly available storage services. Following the insight that security should not rely only on technological means but should be supported by complementary legal and organizational provisions (Schneier 2009), a notary is established that takes responsibility in cases where the Safe Owner is not able to act personally, e.g. the Safe Owner dies or she loses access to the Safe.
  • the Safe Owner can act as a Safe User in relation to another Safe, as well as the Safe User might have its own Safe.
  • public administrations As described above, a Safe Owner might have multiple Safes with different Safe
  • the Safe Infrastructure comprehends the Safe
  • Collaborating Storage Providers could use their combined knowledge to find the parts belonging together. If encryption is used, they would have to wait until the algorithm or the key size has become weak enough to successfully decrypt the Safe Owner's secrets. That is why we need to make sure that Storage Providers do not know the origin of the incoming data. To prevent them from tracking the IP address, we use available anonymity services.
  • the base technologies like anonymous credential systems
  • Trustworthy infrastructures require fundamental measures, not only on a technical but also on the organizational and legal level. The underlying assumptions are:
  • the IT-service providers are capable of providing highly available storage services
  • Safe Owner is not assumed to be an IT-savvy, responsible person who manages to keep her computer free of viruses and malware.
  • the Safe Owner buys the services around the electronic Safe from the Safe
  • the Safe Provider issues personalized certificates and sends the certificates on some specialized hardware and the corresponding initialization passwords to the Safe Owner. Here the same security considerations apply as for the rollout of signature cards.
  • the Safe Provider is responsible for allocating the reliable IT-resources in his own
  • the operation of the Safe comprises all the transactions of the Safe Owner, the storage of valuable personal information, the access of other Safe Users to released data and the notification of the Safe Owner in case of incoming release order requests.
  • the maintenance of the network of Safe comprises all the transactions of the Safe Owner, the storage of valuable personal information, the access of other Safe Users to released data and the notification of the Safe Owner in case of incoming release order requests.
  • the locking and unlocking of an electronic Safe occurs when the Safe Owner has lost her credentials or does not pay the agreed subscription fees. In this case the Safe Owner will not be able to manage her data anymore but other Safe Users can still access the data previously released.
  • the Safe Owner might give up her electronic Safe. If she used the electronic Safe in e-government transactions there will be some other Safe Users, i.e. public authorities, relying on the accessibility of the released data. These data have to be stored until the retention period has expired. After the expiration date the data are deleted automatically. There have to be mechanisms to extend the retention period with the consent of the Safe Owner, and it is an open topic what to do if the Safe Owner has given up the electronic Safe in the meantime.
  • release order request contains the specification of all the information that is necessary for a particular administrative service.
  • release order request contains exactly one document request.
  • release order requests contain lists of data groups, like home address or birth information, not the individual data items like street, number, month of birth etc.
  • the data groups are structured around the conventional forms that we use today to apply for administrative services.
  • the use case of an incoming release order request might include the information to the Safe Owner about a new waiting message by SMS or other convenient channel.
  • Next step is an activity of the Safe Owner, who receives the release order request in her Safe Owner Client, evaluates its requesting party and the underlying context, checks which data groups they request and grants access to this bundle of data. The grants can be withdrawn, as long as the Safe User is not working with these data. In most cases the Safe User - the public administration - will retrieve these data and check the completeness of the released data.
  • a special use case is the combination of several successive release order/grant access communications.
  • if the Safe Owner initiates a more complex administrative process that collects data dependent on the data received in a previous step, then the "online session" is the intended convenience functionality.
  • the Safe Owner gives a kind of repeated access grant to a specified administrative process during this session.
  • REQ_S02 An eavesdropper MUST be prevented from getting any
  • REQ_S03 It MUST NOT be possible for an adversary to retrieve any data that belong to another Safe Owner or any data that is part of a release order request.
  • REQ_S04 The Safe Owner SHOULD be able to grant permissions to save, send, copy, print her data. Therefore the owner SHOULD get to know whether the requesting system is a system that can enforce these permissions (mutual attestation of the involved systems).
  • REQ_S05 The Safe Owner and the Safe User MUST be provided a way to discover, when the Safe Client and the Safe
  • REQ_S06 The Safe Owner and the Safe User MUST be able to use the Safe Infrastructure with knowledge (convenient password) and some property that is easy to take with them and
  • the safe MUST be designed in such a way that multiple steps of safe incorporation are possible.
  • First step is the complementary usage of safes to facilitate data collection via conventional forms.
  • Second step SHOULD be the incorporation into process infrastructures.
  • Third step MAY be the exclusive access via trusted clients.
  • REQ_O02 As nationwide public key infrastructures are not yet available, the safe protocols MUST be formed to recognize that. So Release Order Requests MUST transmit the public key of the Safe User and the Safe Provider MUST provide a service to retrieve the public key of the Safe Owner. It MUST be possible to validate signatures of Safe Users offline.
  • REQ_O03 The released data MUST remain unchanged until the retention period has expired.
  • REQ_O04 The Safe Owner MUST be able to have some backup, in case she loses the access media.
  • REQ_T01 As most service-oriented architecture building blocks rely on web services, the protocols used SHOULD be built upon the web services stack and comply with the WS-I recommendations.
  • REQ_B01 Multiple clients MUST be usable, e.g. personal computer, kiosk system, mobile phone.
  • REQ_B02 Standards and open systems MUST be used wherever possible to create a broad community of Safe Providers.
  • REQ_B03 The XML data structures to be stored in the safe MUST be extensible; they will result from and be used in forms of all kinds. The design MUST allow mapping available data to required data and incorporating existing schemata.
  • REQ_B04 The response time of the Safe Client application SHOULD compare to current web applications (search engines).
  • REQ_B05 The safe MUST allow the versioning of all data and documents stored in it. Proposed Solution
  • REQ_S04 it disallows any data manipulation, duplication or storage without consent of the Safe Owner. Since it operates on a trusted computing infrastructure the Trusted Viewer can provide proof of a well-defined state and that it will perform as expected. Our Safe will deliver the Safe Owner's data only after verifying that proof. So the Safe Owner is in control of her sensitive data, which is not dispersed. Seen from the process modeling view, the Trusted Viewer is a desktop where the public officer executes human interaction tasks that require working with real sensitive data.
  • the Safe Owner is able to prove the ownership of a pseudonym. That's why it is reasonable to decide applications based on
  • the Safe request management is used by each of the clients, i.e. the Trusted Viewer(s) and the Rule Engine.
  • Rule Engine is capable of executing simple tests on the data retrieved from the electronic safe. A certain amount of incoming applications could be pre-evaluated by this rule engine without any human involvement. This could potentially ease the human interaction tasks later on. Additionally it might be viable to let the rule engine work on the real sensitive data and to blind these data in following steps where humans are involved.
  • X.509 certificates are used throughout the communication, whenever the communicating parties need to know each other, e.g. to sign release order requests or to hide the
  • the idemix certificates are used whenever the service
  • the Safe User needs a Certificate Issuance to be able to issue role based certificates. If the citizen releases some data, these data are encrypted for a particular receiver.
  • the receiver might be a natural person or a role, e.g. the
  • the model of the data follows the idea, that the valuable information, e.g. a document, is split into separate chunks of bytes that are stored in little "buckets" with different Storage Providers.
  • the information, which chunks belong to which document, is encrypted and stored with the Safe
  • the buckets stored with the Storage Provider are secured against unauthorized access by cryptographic means, so-called idemix domain pseudonyms.
  • the information model in its central classes is documented in (Penski 2010).
  • Safe User Communication protocols: The splitting of each piece of information into chunks of bytes that are stored with independent Storage Providers involves even the release order requests.
  • the diagram in Fig. 10 abstracts from necessary authentication and authorization procedures with Safe Provider and Storage Providers.
  • Starting point is the Public Officer that works on a certain application of a person who owns an electronic Safe. The officer specifies in his Trusted Viewer which information is needed and which Safe should be used.
  • the Safe Request Management will first retrieve the public key of the Safe Owner. This key is used later on to encrypt the information about where the release order request is stored and how the Safe Owner client is able to reassemble it. Then the release order is split into several parts that are subsequently stored with
  • Storage Provider is not able to
  • Provider is capable to break the encryption of this
  • Rabin, M. (1989), "Efficient dispersal of information for security, load balancing, and fault tolerance", Journal of the ACM, Vol. 36, pp. 335-348. Schneier, B. (2009), "Crypto-Gram-Newsletter", [online], http://www.schneier.com/crypto-gram-0901.html.

Abstract

Methods and systems for safely handling at least one dataset, in particular a document (10, 20), particularly in a cloud computer environment are described, e.g. wherein a) the at least one dataset (10, 20) is partitioned into at least two dataset partitions (1, 2, 3, 4), b) the at least two dataset partitions (1, 2, 3, 4) are stored on and / or retrieved from at least two computer sites (100, 200, 300, 400) being part of, e.g., a cloud computer environment, so that on no computer site (100, 200, 300, 400) sufficient, in particular all, dataset partitions (1, 2, 3, 4) of the at least one dataset (10, 20) are present, in particular not present at the same time.

Description

Methods and Systems for securely handling datasets in computer systems
Systems for collaborative sharing of datasets such as documents ("cloud-ready") are known in the art. However, the trust in these cloud-based infrastructures to store personal, sensitive and/or confidential data is limited.
In the following a cloud computer environment is understood to be an environment where at least one cloud computer system is provided in addition to classical storage systems like individual computers, external storage media like USB sticks or smartcards etc.
Cloud computing describes in general IT services based on a network, in particular the Internet, and it typically involves the provision of dynamically scalable and often virtualized resources as a service over the network. Typical cloud computing providers deliver business applications online which are accessed from another web service or software like a web browser, while the software and data are stored on servers.
Solutions to prevent data leaks often get complex, and the cloud customers have no chance to control the security measures taken by the cloud service provider to prevent insider/outsider attacks. The investment in data leak prevention tools and the necessary security consulting is often avoided.
The cloud customer can only rely on Service Level Agreements with the cloud provider based on certifications and regular security reviews by internal and trusted third parties. Even the cloud space providers face the problem of proving that every endeavor has been made to keep the customers' data confidential. So they spend a lot of money without being sure, or able to guarantee, that their own administrators are not leaking confidential data.
Anonymous credential systems have been described by Chaum, D. ("Security without identification: Transaction systems to make big brother obsolete", Communications of the ACM 28(10), pp. 1030-1044) and Camenisch, J. and Herreweghen, E.V. ((2002) "Design and implementation of the idemix anonymous credential system", CCS '02: Proceedings of the 9th ACM conference on Computer and communications security, New York, NY, USA, ACM Press, pp. 21-30).
Secret sharing has been described by Shamir and Rabin (Rabin, M (1989), "Efficient dispersal of information for security, load balancing, and fault tolerance", Journal of the ACM, Vol.36, pp. 335-348) .
Since the mid-1990s, technical infrastructures suitable for electronic document safes have been discussed with different objectives in mind:
- performance and fault tolerance (Paul, A. et al. (2007), "e-SAFE: An Extensible, Secure and Fault Tolerant Storage System", Proc. IEEE Self-Adaptive and Self-Organizing Systems (SASO '07), IEEE Press, pp. 257-268, doi: 10.1109/SASO.2007.21),
- confidentiality in dispersed untrusted storage environments (Zhang, et al. (2008), "Towards A Secure Distribute Storage System", Advanced Communication Technology (ICACT 2008), IEEE Press, Apr. 2008, pp. 1612-1617, doi:10.1109/ICACT.2008.4494090), (US-A 7,349,987), (Iyengar, A. et al. (1998), "Design and Implementation of a Secure Distributed Data Repository", In Proc. of the 14th IFIP Internat. Information Security Conf., pp. 123-135), (Kubiatowicz et al. (2000), "Oceanstore: An architecture for global-scale persistent storage", Proc. of the ninth international conference on Architectural support for programming languages and operating systems (ASPLOS '00), ACM SIGARCH Computer Architecture News, Dec. 2000, pp. 190-201, ISSN: 0163-5964), as distributed p2p systems (US Patent Application 20060078127), as smart card extension (US-A 7,206,847), as backup.
It is necessary to develop methods and systems for securely handling datasets, in particular files or documents, in a cloud computing environment. In the following, a number of embodiments are described which address this issue. A person skilled in the art will recognize that those embodiments can be combined to derive variations of the embodiments described.
One embodiment is a method for safely handling at least one dataset, in particular a document, in a cloud computer environment, wherein
a) the at least one dataset is partitioned into at least two dataset partitions,
b) the at least two dataset partitions are stored on and / or retrieved from at least two computer sites being part of a cloud computer environment, so that on no computer site all dataset partitions of the at least one dataset are present, in particular not present at the same time.
Storing datasets, e.g. sensitive documents, distributed over a cloud computer environment enhances the security of the individual documents. Access to an individual partition of the dataset would not compromise the security of the complete document, since the partition itself does not provide sufficient information to reconstruct the whole document.
Another embodiment is a method for safely handling at least one dataset, in particular a document, in a cloud computer environment, wherein a retrieval of a partitioned dataset to combine the at least one dataset comprises
a) an authorization by an authorized user, in particular an owner of the at least one dataset at a safe system,
b) on positive completion of the authorization, distribution information regarding the dataset partitions is automatically retrieved from the safe system,
c) all dataset partitions of the at least one dataset are retrieved from computer sites,
d) the dataset partitions are transformed into a logical representation of the at least one dataset.
Here a safe system stores the distribution-information of the datasets which are to be distributed to e.g. storage providers. In this and the following contexts, the term "authorization" means that the user, in particular an owner, possesses the right to perform the action in question.
A further embodiment is a method for safely handling at least one dataset, in particular a document, in a cloud computer environment, wherein a storing of a partitioned dataset derived from the at least one dataset comprises
a) an authorization by a user, in particular an owner of the at least one dataset at a safe system,
b) on positive completion of the authorization, distribution information regarding the dataset partitions is automatically transferred to the safe system,
c) all dataset partitions of the at least one dataset are automatically stored on computer sites as physical sub-representations of the at least one dataset, so that none of the computer sites comprises more than one instance of the dataset partitions, in particular not at the same time.
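The storing constraint in step c) can be illustrated with a minimal Python sketch. The function name `assign_partitions`, the use of integer identifiers for partitions and sites, and the random choice of sites are assumptions made for illustration only; the description does not prescribe how sites are selected.

```python
import random

def assign_partitions(partition_ids, site_ids, rng=None):
    """Map every partition to a different site (hypothetical helper).

    Each site receives at most one partition, so no site can hold
    all partitions of the dataset.
    """
    rng = rng or random.Random()
    if len(partition_ids) < 2:
        raise ValueError("at least two partitions are required")
    if len(site_ids) < len(partition_ids):
        raise ValueError("need at least as many sites as partitions")
    # sample() draws without replacement, so the chosen sites are distinct
    chosen_sites = rng.sample(list(site_ids), len(partition_ids))
    return dict(zip(partition_ids, chosen_sites))

# Reference numerals from the description reused as illustrative identifiers
placement = assign_partitions([1, 2, 3, 4], [100, 200, 300, 400])
```

The resulting mapping corresponds to the distribution information that would be transferred to the safe system in step b).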
We use a distribution mechanism, especially a physical distribution mechanism of the data, in particular documents, across multiple independent storage servers to increase the security of the data and documents and increase the trust in the Cloud environment. Different embodiments are described in an exemplary way in the following figures,
Fig. 1 showing schematically a document with several document partitions and its distribution;
Fig. 2 showing schematically document partitions distributed over several servers;
Fig. 3 showing schematically a logical and physical representation of data;
Fig. 4 showing schematically the interactions of a safe owner;
Fig. 5 showing an embodiment of a read mechanism involving a safe owner, a safe provider and storage providers;
Fig. 6 showing an actor model for an electronic safe system;
Fig. 7 showing an electronic life cycle of a safe system;
Fig. 8 showing the collection of data involving a safe system owner and a safe user;
Fig. 9 showing a safe user infrastructure;
Fig. 10 showing the process of a safe user sending a release order request to a safe system.
In a first embodiment, the method and the system are used to improve the security of datasets 10, 20, in particular documents 10, 20. A document 10, 20 is a special instance of a computer file, comprising blocks of arbitrary information or a resource for storing information, being available for a computer program. Therefore, documents 10, 20 or files are to be considered separate from programs, which can be used to, e.g., create, alter or store documents or files. A dataset 10, 20 is understood to be even more general than a file since it makes no inherent assumption about the internal structure. A dataset is understood to be a set of binary data which can be distinguished from other datasets.
Even though the embodiments described below are not limited to documents 10, 20, for the sake of simplicity the embodiments are described in the context of documents 10, 20, rather than datasets.
Before describing embodiments of methods and systems, a distinction in the representation of documents 10, 20 (or files) is given. For the sake of simplicity, the embodiment will be described in the context of documents 10, 20 only.
A logical representation of a document 10, 20 is a representation that enables a person 1000 with the correct authorizations for file usage on a computer system to, e.g., distribute, retrieve, interpret and process a physical distribution of the document 10, 20.
The physically distributed representation of a document 10, 20 contains distribution-information (e.g. a distribution tag 11, 12, 13) that enables the person 1000 authorized to access the document 10, 20 to transform the document 10, 20 back to its logical representation. The tag name 11, 12, 13, which can be the name of the file (or a part thereof), allows the identification of the document partitions 1, 2, 3, 4. The document 10, 20 can be encrypted and then be partitioned into the partitions 1, 2, 3, 4. The partitions can be stored at the storage providers under a number. The storing with the storage providers is performed together with the storing of a so-called domain pseudonym. This way the person 1000 authorized to access the document 10, 20 can read the documents 10, 20 without giving away the identity of the user, in particular the owner.
In the embodiment described here the physically distributed representation of the document 10, 20 is the physical partition of the document 10, 20 into document partitions 1, 2, 3, 4. Those document partitions 1, 2, 3, 4 can be distributed over different independent storage providers. A storage provider is a service to store arbitrary datasets, in particular documents 10, 20. It might be a professional IT service provider, a cloud provider, personal computers and/or mobile storage devices (USB cards, mobile phone etc.).
Embodiments of the methods and systems for administering documents 10, 20 in distributed form are illustrated in Fig. 1, 2 and 3.
In Fig. 1 a document 10 is shown as one logical representation, as it exists on one particular computer system, e.g. a safe client or mobile device. The document 10 is divided into three different document partitions 1, 2, 3. Each of the document partitions 1, 2, 3 is associated with a distribution tag 11, 12, 13 as distribution-information. The partitions 1, 2, 3 are stored in different computer sites 100, 200, 300 at storage providers.
If the individual document partitions 1, 2, 3 are known to a safe system 1010 (see Fig. 3, i.e. the distribution-information is known to the safe system 1010) they can be recombined to the document 10 in the correct order using the distribution tags 11, 12, 13. The document partitions 1, 2, 3 are datasets comprising binary data.
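As a minimal sketch of partitioning with distribution tags and recombination in the correct order, consider the following Python example. It assumes, for illustration only, that the distribution tag is a simple integer position and that the partitions are contiguous byte ranges; the names `partition` and `recombine` are hypothetical.

```python
def partition(data: bytes, n: int):
    """Split data into n contiguous partitions, each paired with a tag.

    The tag is the position of the partition in the original document
    (an assumption for this sketch).
    """
    if n < 2:
        raise ValueError("at least two partitions are required")
    size = -(-len(data) // n)  # ceiling division
    return [(tag, data[tag * size:(tag + 1) * size]) for tag in range(n)]

def recombine(tagged_partitions):
    """Restore the logical document by ordering partitions by their tag."""
    ordered = sorted(tagged_partitions, key=lambda item: item[0])
    return b"".join(chunk for _, chunk in ordered)

document = b"scan of a birth certificate"
parts = partition(document, 3)
# The partitions may come back from the storage providers in any order
restored = recombine(reversed(parts))
```

Without the tags, a party holding the partitions would not know in which order, or to which document, they belong.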
In principle it is possible to keep the distribution-information 11, 12, 13 on the client and distribute the partitions 1, 2, 3 directly to the computer sites 100, 200, 300.
The safe system 1010 is a personalized collection of services to store, retrieve and manage the physical distribution representations of documents 10, 20. The safe service provides the authorized user, in particular the owner of sensitive documents, the capability to transform her documents from physical to logical representation and vice versa.
The document partitions 1, 2, 3 can be arbitrarily chosen, i.e. the document 10 can be arbitrarily divided. The document partitions 1, 2, 3 can, but do not have to, have the same size. The number of document partitions 1, 2, 3 is two or greater, so that a distribution over different sites (e.g. computer servers) 100, 200, 300, 400 (see Fig. 2, 3) can be achieved. One way to partition a document 10, 20 is the secret sharing algorithm of Rabin. The method of Rabin inherently produces partly redundant partitions in a configurable manner.
It is not mandatory to store redundant parts (e.g. obtained by the secret sharing algorithm of Rabin) of document partitions 1, 2, 3' on different servers 100, 200, 300, but this increases the availability and therefore the safety of the document storage. In Fig. 2 each of the document partitions 1, 2, 3 has at least one redundant part. If the document 10 was, e.g., partitioned by the algorithm of Rabin, two of the three partitions would be sufficient to recombine the document 10.
The partitioning is not bound to any specific algorithm; in particular, algorithms providing redundancy may be applied to increase the safety against (temporary or permanent) inaccessibility of document partitions or systems due to failures.
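The "two of three partitions suffice" property can be illustrated with a deliberately simple stand-in. The following Python sketch uses XOR parity, which is not the algorithm of Rabin referred to above; it only demonstrates the redundancy idea. The names `split_2_of_3` and `recover` and the keys "a", "b", "p" are assumptions. Note that each partition in this sketch still exposes part of the plaintext, so it would have to be combined with encryption.

```python
def _xor(left: bytes, right: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(left, right))

def split_2_of_3(data: bytes):
    """Produce three partitions of which any two restore the data.

    Partitions "a" and "b" are the two halves, "p" is their XOR parity.
    """
    half = -(-len(data) // 2)  # ceiling division
    a = data[:half]
    b = data[half:].ljust(half, b"\x00")  # pad the second half if needed
    return {"a": a, "b": b, "p": _xor(a, b)}, len(data)

def recover(parts: dict, length: int) -> bytes:
    """Rebuild the data from any two of the three partitions."""
    if len(parts) < 2:
        raise ValueError("at least two partitions are required")
    a, b, p = parts.get("a"), parts.get("b"), parts.get("p")
    if a is None:
        a = _xor(b, p)
    elif b is None:
        b = _xor(a, p)
    return (a + b)[:length]  # strip the padding again

parts, length = split_2_of_3(b"place of residence")
# Simulate the loss of one storage provider
survivors = {name: chunk for name, chunk in parts.items() if name != "a"}
restored = recover(survivors, length)
```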
To ensure the trustworthiness of the cloud computer environment, only the person 1000 holding a sufficiently large subset or all of the document partitions 1, 2, 3 can recombine the document partitions 1, 2, 3 to the one document 10. The concrete number of partitions required depends on the partitioning algorithm applied. In the simplest case of non-redundant, non-overlapping partitioning, if one of the document partitions 1, 2, 3 is missing and/or corrupted, the document 10 cannot be recombined. So if an internal or external attacker gets access to a computer system with one of the document partitions 1, 2, 3 (or one of its copies), the content of the complete document 10 cannot be known to him. If a portion is sufficiently small and/or carries several non-contiguous parts of document 10, then even if it is unencrypted, having access to one document portion 1, 2, 3 would not provide meaningful information to an attacker.
In the case of the secret sharing algorithms of Rabin, a subset of the document partitions 1, 2, 3, 4 suffices to recombine the document 10, 20.
In a further embodiment at least one of the document partitions 1, 2, 3 is encrypted. This can mean that the original file 10, 20 has been encrypted, so that its partitions 1, 2, 3 obtained their encryption from the original document 10, 20. Alternatively or additionally the document partitions 1, 2, 3 can be encrypted separately prior to or after the distribution of the document partitions 1, 2, 3 to different servers 100, 200, 300, 400.
Before the mechanisms of the distribution are discussed, the advantage of the separation of responsibilities is described.
A first mechanism in order to achieve trustworthy cloud spaces is the separation of responsibilities. For most sensitive private or business documents 10, 20 the owner 1000 has to be in full control of the document 10, 20.
A storage provider is responsible for availability, consistency and confidentiality of the data. A storage provider can be the user using his own computer, a professional storage provider outside or a cloud provider in the future. Currently the documents 10, 20 are stored logically and physically at the same location. This - by nature - makes them vulnerable to attacks from inside and outside the storage location. This is one of the issues the embodiments shown here address.
The separation into a logical and a physical view of documents 10, 20 allows a better separation of concerns in the storage and retrieval of sensitive documents 10, 20 and a better mechanism against breaching attacks. The separation of the logical representation of the document 10, 20 and its physically distributed representation over different independent sites is described in Fig. 1, 2, 3. The owner 1000 controls the access to his documents 10, 20 via the safe system 1010.
The safe system 1010 logically comprises all documents 10, 20 and their related management information, i.e. it holds information about the storage locations of the documents 10, 20. The safe system 1010 might work with documents 10, 20 in different formats. In Fig. 3 a safe owner 1000 owns two documents 10, 20. Those documents 10, 20 can be e.g. a pdf-file and a doc-file.
Naturally the safe owner 1000 can own any number of documents 10, 20 which can all have the same file format or which can have different file formats as shown in Fig. 3.
The safe owner 1000, e.g., hires a safe system 1010 provided by the safe provider. In other alternatives the safe owner 1000 can have more than one safe 1010 with more than one safe provider. The safe provider is an entity or organization that provides safe systems and is liable for the safe. The safe system 1010 provider and the safe owner 1000 can be the same entity.
The safe owner 1000 owns the contents of his safe 1010 or safes.
In Fig. 3 the physically distributed representation A shows four instances of data (indicated by the binary strings). These are document partitions 1, 2, 3, 4 of the documents 10, 20 which are distributed over several sites 100, 200, 300, 400.
Only the logical view B of the documents 10, 20 makes the document partitions 1, 2, 3, 4 readable and processable. In this sense the cloud (represented by systems 100, 200, 300, 400 here) becomes a necessary infrastructure to provide the distribution infrastructure for the documents 10, 20 at the physical level for higher safety and/or security of the logical documents. The distribution of the document partitions 1, 2, 3, 4 does not decrease the safety and security of the documents by distributing them over the net; the distribution rather improves the safety and security by not storing the document partitions 1, 2, 3, 4 in one place.
The physical data of the documents 10 is distributed as document partitions 1, 2, 3, 4 across different physical storage providers 100, 200, 300, 400, the client computer, internal and external storage devices so that every physical storage provider can only access the physical portion of the document 10, 20 he owns.
In a further embodiment, the storage sites 100, 200, 300, 400 change dynamically at least for a subset of the at least two document partitions 1, 2, 3, 4. This can be achieved by the safe system 1010, which can dynamically allocate new storage sites. The allocation can follow a random (or pseudo-random) process. The dynamic shifting of the partitioned document makes an unauthorized retrieval of the document 10, 20 even more difficult.
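A minimal Python sketch of such a dynamic reallocation is given below. It assumes, for illustration only, that the distribution information is a mapping from partition to site and that one randomly chosen partition is moved to a currently unused site per step; the function name `relocate` is hypothetical, and the actual transfer of the partition data is omitted.

```python
import random

def relocate(placement, all_sites, rng=None):
    """Return a new placement with one partition moved to an unused site.

    Moving only to unused sites keeps the property that no site holds
    more than one partition of the document.
    """
    rng = rng or random.Random()
    free_sites = sorted(set(all_sites) - set(placement.values()))
    if not free_sites:
        raise ValueError("no unused storage site available")
    partition_id = rng.choice(sorted(placement))
    new_placement = dict(placement)
    new_placement[partition_id] = rng.choice(free_sites)
    return new_placement

placement = {1: 100, 2: 200, 3: 300}
# Sites 500 and 600 are illustrative additions, not reference numerals
# from the description
placement = relocate(placement, [100, 200, 300, 400, 500, 600])
```

After each step the updated mapping would have to be recorded as the new distribution information, otherwise the owner could no longer retrieve the moved partition.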
The transformation and the recovery mechanism are realized at the client side and controlled by the owner 1000 of the documents 10, 20.
External attacks can be successful only if a sufficiently large number of (in the simplest case all) physical parts of the data (i.e. document partitions 1, 2, 3, 4) are online, which limits the time window of a potential attack, and only if the independent storage sites of the document are breached. Simultaneous attacks at different locations are in general very difficult, if not impossible, to achieve. This required synchronicity can be exploited deliberately by setting a schedule so that the distributed storage sites for the document partitions 1, 2, 3, 4 are online only at certain times, known to the owner 1000 only. This is another example showing that distributing data (e.g. document partitions 1, 2, 3, 4) over a cloud can increase safety rather than decrease it.
Each storage provider is responsible only for the physical part of the document 10, 20 (i.e. the document partitions 1, 2, 3) which was assigned to him. He is not responsible for confidentiality of the document 10, 20, as each single physical document partition does not reveal any information about the original document 10, 20 (logical view of the document).
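One way to obtain partitions that individually reveal nothing about the document is simple XOR secret splitting, sketched below in Python. This is an illustrative technique chosen for this example, not the partitioning method prescribed by the description; it corresponds to the non-redundant case in which all partitions are required. The names `split_xor` and `combine_xor` are hypothetical.

```python
import secrets
from functools import reduce

def _xor(left: bytes, right: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(left, right))

def split_xor(data: bytes, n: int):
    """Split data into n shares; all n are needed to restore it.

    The first n-1 shares are random bytes, the last share is the data
    XORed with all of them, so any single share looks random.
    """
    if n < 2:
        raise ValueError("at least two shares are required")
    shares = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    shares.append(reduce(_xor, shares, data))
    return shares

def combine_xor(shares):
    """XOR all shares together to recover the original data."""
    return reduce(_xor, shares)

shares = split_xor(b"place of residence", 4)
restored = combine_xor(shares)
```

Each share has the same length as the document, which is the price of this particular scheme.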
If the document partitions are distributed to several storage providers, an insider attack (at any storage provider) cannot be successful, since only parts of the physical data (i.e. some individual document partitions 1, 2, 3, 4) can be addressed.
The safe mechanism provides the trustworthy storage capability for sensitive documents 10. On top of this mechanism the usual encryption mechanisms can be used to increase the safety and/or security of the systems. A safe client (e.g. a mobile device connectable to the safe system 1010) may encrypt and/or partition the document and store it at independent Storage Providers (see Fig. 2 and 3). The safe client may distribute the physical document representation based on different algorithms to independent Storage Providers.
This distribution and storage is performed at interconnected storage units such as mobile devices (USB, smartcards, mobile phones etc.), the user's own computer, or cloud(s). Connectivity between these storage devices and the safe client is a prerequisite. The confidentiality is achieved by physical distribution of the document 10, 20, not by trust and regulation.
This is shown in Fig. 4. Here an owner 1000 is interacting with a mobile device 1001. As shown in Fig. 3 the document partitions 1, 2, 3, 4 are stored in a distributed way on sites 100, 200, 300, 400.
A mobile device 1001 becomes a safe owner key, increasing the user's perception of, and responsibility for, trust and security in the cloud storage system. The owner 1000 of the mobile device 1001 (e.g. mobile phone, mobile computer) can temporarily have access to his documents 10, 20, i.e. access the document partitions 1, 2, 3, 4 from the cloud of the sites 100, 200, 300, 400, combine the document partitions 1, 2, 3, 4 to the document 10, 20 on the mobile device 1001, work with the documents 10, 20 and then transmit the document partitions 1, 2, 3, 4 back to the cloud. The documents 10, 20 would not be stored on the mobile device 1001, reducing the risk that sensitive documents 10, 20 are lost with the mobile device.
As mentioned above the logical information is split at the client side. That means, that the only location, where the logical data is available in plain text format (or in another human readable format), is on the safe client. Different mechanisms can be provided to secure the safe client.
Fig. 5 shows an embodiment of a read mechanism involving a safe owner (safe client), safe provider (with the safe system 1010) and storage providers.
The read process starts with an authorization of the owner 1000 at his or her safe system 1010 at the safe provider administering the safe system 1010. From the safe provider, the distribution information about the document partitions 1, 2, 3, 4 is obtained and then decrypted. In general it is advantageous to encrypt the stored data (e.g. distribution information) at the safe provider.
Now the individual document partitions 1, 2, 3, 4 are retrieved from the storage providers. This is a loop which lasts until the complete document 10 is restored on the safe client of the owner 1000 by assembling the physical partitions of the document 10. Typically there are n Storage Providers involved. The degree of security of the safe concept increases with the number of independent storage providers for a certain document 10, 20.
The write interaction occurs in reverse order: first the safe client stores the physical partitions at independent storage providers, then she encrypts the meta-information, and finally she stores this encrypted meta-information within her safe provided by the safe provider. The authorization (and authentication) ensures that only the legitimate owner 1000 of the safe system 1010 may access it.
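The write and read interactions described above can be sketched with in-memory stand-ins. Everything in the following Python example is an assumption made for illustration: the classes `StorageProvider` and `SafeProvider`, the random bucket identifiers, the JSON encoding of the meta-information, and in particular `keystream_xor`, which is a toy placeholder for a real cipher and must not be used to protect actual data. Authentication and authorization at the safe provider are omitted.

```python
import hashlib
import json
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy placeholder cipher (SHA-256 based keystream); NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(x ^ y for x, y in zip(data, stream))

class StorageProvider:
    """Stores anonymous chunks ("buckets") under random identifiers."""
    def __init__(self):
        self.buckets = {}
    def put(self, chunk: bytes) -> str:
        bucket_id = secrets.token_hex(8)
        self.buckets[bucket_id] = chunk
        return bucket_id
    def get(self, bucket_id: str) -> bytes:
        return self.buckets[bucket_id]

class SafeProvider:
    """Stores only the encrypted meta-information of each safe owner."""
    def __init__(self):
        self.safes = {}
    def store(self, owner: str, blob: bytes):
        self.safes[owner] = blob
    def load(self, owner: str) -> bytes:
        return self.safes[owner]

def write_document(owner, key, data, providers, safe):
    # 1) store the physical partitions, 2) encrypt the meta-information,
    # 3) store the encrypted meta-information in the safe
    size = -(-len(data) // len(providers))  # ceiling division
    meta = []
    for index, provider in enumerate(providers):
        bucket_id = provider.put(data[index * size:(index + 1) * size])
        meta.append([index, bucket_id])
    safe.store(owner, keystream_xor(key, json.dumps(meta).encode()))

def read_document(owner, key, providers, safe):
    # Reverse order: fetch and decrypt meta-information, then loop over
    # the storage providers and assemble the partitions on the client
    meta = json.loads(keystream_xor(key, safe.load(owner)).decode())
    return b"".join(providers[index].get(bucket_id)
                    for index, bucket_id in sorted(meta))
```

In this sketch the safe provider never sees partition contents, and each storage provider sees one chunk without knowing which document it belongs to.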
In the following, further aspects and embodiments of the safe handling of datasets 1, 2, 3, 4 are described, in particular with applications to electronic safes for process-oriented e-government. The person skilled in the art will recognize that the embodiments described above are applicable to the methods and systems described below and vice versa.
Today highly available, scalable storage is very common and cheap, and many "document safes" or collaborative document sharing systems are available on the market. However, the trust in server-based infrastructures to store personal, sensitive data is limited. Major data breaches at public and private sector service providers suggest that there is a need for a more structural answer to these problems.
Solutions to prevent data leaks often get complex and expensive on their own, so that public administrations tend to avoid the investment in security consulting and the necessary infrastructure. Even operating such complex infrastructures requires skilled and engaged personnel, which makes it expensive. Relying on organizational measures and the operating staff means being open to attack to a certain degree. Last but not least, today's e-government solutions depend heavily on an experienced end user who is capable of keeping his personal computer free of viruses and malware. This seems at least questionable, as it is challenging to keep up with the development of new anti-virus software.
In the following, a model for critical infrastructures is described that keeps personal, sensitive datasets 10, 20 confidential for a long time without relying on trustworthy IT solution providers, loyal personnel or secure networks. The proposed infrastructure can be used to store documents 10, 20, e.g. the scan of a birth certificate, as well as XML data, e.g. "place of residence", that can be used by the citizen in all kinds of application processes. The confidentiality of stored datasets 10, 20, the unobservability of communication and the unlinkability of user transactions are targeted.
The aspects of secure storage are separated and a concept of a trustworthy electronic safe system 1010 for data and documents is developed. In principle it is possible to partition the documents on a client device, e.g. a mobile device. Then the distribution-information is kept on the client device, while the document partitions 1, 2, 3, 4 are distributed (i.e. stored) at various storage providers.
In a further embodiment, a safe system 1010 is introduced to keep the distribution-information (preferably encrypted) on a safe system provided by a safe provider, while the partitioning and/or recombination of the partitions is still performed on the client device. The safe system 1010 infrastructure requires multiple actors: the safe owner 1000, who manages her private data within the safe system 1010; the storage provider, which stores small data blocks of content; the safe provider, which stores some kind of directory information; and the safe user, who gains insight into the owner's safe system 1010. It should be noted that the person 1000 can be the safe owner, but this does not necessarily have to be the case.
The description starts with the key requirements structured in use cases and their explanation. Afterwards essential parts of a prototypical implementation for all of the actors and the protocol between them for a chosen scenario are described.
Process oriented e-government
Traditionally public administrations are structured along public duties. Many public authorities and their departments are tailored accordingly. This orientation often leads to procedures which end at the organizational boundaries of individual authorities. In the age of IT-based processes this task oriented approach is too narrow. In many cases administrative processes are just one part of larger processes, cross-cutting through organizational limits and different levels. Especially businesses have to provide preliminary results in order to fulfill administrative requirements. Once public authorities have handled the requests, the businesses have to re-integrate them into their internal processes. Thus the necessary cooperation of public and private sector can function with less friction if there are fewer format mismatches. In the interest of a high-performing European Union a process oriented alignment of the public sector is thus necessary.
Process oriented e-government is the result of the paradigm shift from a task-oriented, regionally distributed paper based public management towards the collaborative cooperation of various public agencies and service providers. User centric process management emphasizes the central role of the citizen in e-government processes. The European Services Directive (DIRECTIVE 2006/123/EC) strengthens the position of applicants in e-government processes in order to promote growth and create jobs in the European Union. The directive demands transparency of even complex processes. The applicant should be able to understand and control the steps in the application process.
Electronic Safes as trustworthy e-government infrastructures
An electronic safe system 1010 for datasets and in particular documents 10, 20 offers the safe owner 1000 confidentiality and availability of information stored in it. The storage facilities are organized in a strongly decentralized manner (e.g. by the methods and systems described above), so the safe system 1010 is not a conventional database, registry or something similar.
The digital analogy of a conventional safe is capable of keeping the usage and the communicating parties confidential. The resulting communication traces are useless for any eavesdropper. The information stored in the safe system 1010 is assured not to be recoverable, even in the long term, without the consent of the safe owner 1000 or her delegate.
Electronic safes (or safe systems 1010) are essential e-government infrastructure components, as they reduce the repetitive data collections at the beginning of each workflow. Besides personal data, the electronic safe stores results of administrative processes in terms of electronic certificates. The public administration gains much higher data quality when these certificates are reused later on. So the electronic safe is a modern privacy-enhancing component and at the same time a critical infrastructure that makes e-government processes more efficient.
Electronic safes promise to secure the availability and integrity of data and documents as a basis for the decision-making processes of the administration. They make it legitimate to relieve the public administrations of the archiving of data and documents of the applicant, which is desirable for economic and privacy reasons.
Proposed model
Separation of responsibilities
The first idea behind the concept of electronic safes is the separation of responsibilities.
For most users it seems easy to store their personal information on their own computer. In this case the user is fully in control of her data but is responsible for availability, i.e. regular backups, and confidentiality of the data. This is getting more and more time consuming, because it includes keeping the personal computer free of viruses and other malware. At the same time the user will only be able to access the data when she has physical access to her computer.
Both reasons lead to a situation where the user is ready to give up some of her control over the data. She commissions a professional service to store her data with some "trusted storage provider" and can access the data whenever she has access to the Internet. This could include the use of the data in e-government processes too. Although this scenario is desirable, it often fails, because a single instance is responsible for available, confidential storage and because network connections can be eavesdropped. It is proposed to split and, e.g., encrypt the dataset, in particular documents 10, 20, into many pieces of a "puzzle" and to store these puzzle pieces with independent storage providers. Each storage provider is responsible only for the available storage of a single piece of the data. It is not responsible for confidentiality, as no single piece reveals any information about the original "plain text" (i.e. the dataset or document 10, 20).
Any eavesdropper will obtain only puzzle pieces (probably even encrypted) that are useless without the others. Confidentiality is thus achieved by design, not by trust and regulation.
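The "puzzle" idea can be illustrated with a minimal Python sketch. It assumes a simple XOR-based split in which all pieces are needed for recovery; the text does not prescribe this scheme, and the function names `split_into_pieces` and `combine_pieces` are hypothetical. All pieces but the last are random bytes, and the last is the data XORed with all of them, so any incomplete set of pieces looks like random noise.

```python
import secrets
from functools import reduce


def _xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_into_pieces(data: bytes, n: int) -> list[bytes]:
    """Split data into n pieces; all n are needed to recover it."""
    if n < 2:
        raise ValueError("need at least two pieces")
    # n - 1 pieces are pure randomness of the same length as the data.
    pieces = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    # The final piece is the data XORed with every random piece.
    pieces.append(reduce(_xor, pieces, data))
    return pieces


def combine_pieces(pieces: list[bytes]) -> bytes:
    """XOR all pieces together to recover the original data."""
    return reduce(_xor, pieces)


document = b"statement of residence"
pieces = split_into_pieces(document, 4)  # one piece per storage provider
restored = combine_pieces(pieces)
```

In this sketch each piece is as long as the original, so the total storage grows linearly with the number of pieces.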
The different actors and their roles are shown in Fig. 6. The Safe Owner 1000 is the natural person who collects her data within the Safe. The Safe Provider is an IT solution provider that operates the Safe.
Multiple Safe Providers on the market can follow standard protocols, so the Safe Owner 1000 can easily access all Safes with the same Safe Client software. The Safe User is the party that might receive grants to access parts of the personal data of the Safe Owner. A Safe User is able to send data to the Safe Owner without needing explicit grants from the Safe Owner to do so. It is assumed that there is a constant "safe-address" that is known to the Safe User. The Storage Provider is an IT service provider that offers highly available storage services. Following the insight that security should not rely only on technological means but should be supported by complementary legal and organizational provisions (Schneier 2009), a notary is established who takes responsibility in cases where the Safe Owner is not able to act personally, e.g. when the Safe Owner dies or loses access to the Safe. The Safe Owner can act as a Safe User in relation to another Safe, and the Safe User might have its own Safe. In a scenario related to e-government, public administrations are considered Safe Users. As described above, a Safe Owner might have multiple Safes with different Safe Providers. The Safe Infrastructure comprises the Safe Client application, the Safe Provider web services, the Storage Provider web services and the anonymity services, each with the underlying web container, database, trusted operating system and hardware.
Anonymous communication
Even with the separation of responsibilities outlined above, an eavesdropping adversary would be able to infer communication patterns and communicating parties.
Collaborating Storage Providers could use their combined knowledge to find the parts that belong together. If encryption is used, they would only have to wait until the algorithm or the key size has become weak enough to decrypt the Safe Owner's secrets. That is why Storage Providers must not learn the origin of the incoming data. To prevent them from tracking the IP address, available anonymity services are used.
However, anonymous communication does not mean that everybody can use the storage services. Only the regular owner and the authorized users of the data are able to retrieve the data. Communicating parties that use the Electronic Safe can verify each other's identity. Accounting based on a subscription model is adopted. All in all, the model is ready to be integrated into e-government processes.
Related work
The base technologies, such as anonymous credential systems (Chaum 1985) (Camenisch 2002) and secret sharing (Rabin 1989), were invented long ago. Since the mid-90s, technical infrastructures suitable for electronic document safes have been discussed with different objectives in mind: performance and fault tolerance (Paul 2007), confidentiality in dispersed untrusted storage environments (Zhang 2008) (Redlich 2008) (Iyengar 1998) (Kubiatowicz 2000), distributed p2p systems (Cacayorin 2004), smart card extensions (Albert Jr. 2000), backup, etc. Document storage services come in various flavors: as collaborative environments, as file sharing facilities, or as part of web-conferencing tools. Large document safe infrastructures have been deployed, e.g. the Austrian cyberdoc system, which connects notaries' offices all over Austria, or the Danish e-Boks system, which facilitates communication between Danish citizens, companies and public authorities, to name only two. None of these systems, however, provides the targeted privacy, availability and confidentiality in the long term. With the concepts presented here, the Safe is incorporated into e-government processes while preserving the privacy-keeping properties of the conventional safe. Cheap storage services are used, both to make it easy to set up a new conformant storage service, thereby broadening the secure base of independent Storage Providers, and to make operating a safe more attractive for Safe Owners.
Requirements analysis
The history of electronic signature cards shows that rolling out fundamental e-government infrastructures is a cumbersome task. The whole life cycle of the electronic Safe has to be considered: creating real benefits for the Safe Owner, evaluating the organizational needs of public administrations as Safe Users, and making the model attractive for potential Safe Providers, to name only some of the tasks. In particular, the relation between real benefit for the Safe Owner and attractiveness for the Safe User to incorporate the electronic Safe into its processes is a chicken-and-egg dilemma. For the sake of brevity, we concentrate on the requirements that directly affect the Safe User's protocol and the integration into e-government processes. The key words are used as defined in RFC 2119.
Trust model
Trustworthy infrastructures require fundamental measures, not only on the technical but also on the organizational and legal level. The underlying assumptions are:
The IT service providers are capable of providing highly available storage services,
In the long term, data losses within IT service providers cannot be excluded,
In the long term, data breaches within IT service providers, from outside or inside, will occur,
It is possible that different IT service providers cooperate to recover the content of safes without the consent of a Safe Owner,
There is a certain chance that some kind of superior instance will try to control the independent IT service providers.
Finally, the Safe Owner is not assumed to be an IT-savvy, responsible person who manages to keep her computer free of viruses and malware.
Life cycle of an electronic Safe
From the citizen's point of view, the life cycle (see use cases in Fig. 7) of an electronic Safe starts and ends with the contract with a Safe Provider. Nevertheless, there are other activities concerning the reliable transfer of Safe Services. Comparable to certificate service providers (CSP), Safe Providers are obliged to transfer their existing Safes to another Safe Provider when they close down their business.
The allocation of an electronic Safe implies the registration of the citizen with the Safe Provider. This may be achieved by personal identification or online with strong authentication, e.g. a national ID card. The Safe Owner buys the services around the electronic Safe from the Safe Provider. The Safe Provider issues personalized certificates and sends the certificates on specialized hardware, together with the corresponding initialization passwords, to the Safe Owner. Here the same security considerations apply as for the rollout of signature cards. The Safe Provider is responsible for allocating reliable IT resources in its own administrative domain as well as in the domains of the available Storage Providers.
The operation of the Safe comprises all the transactions of the Safe Owner, the storage of valuable personal information, the access of other Safe Users to released data and the notification of the Safe Owner in case of incoming release order requests. The maintenance of the network of Safe Providers and Storage Providers is a substantial part of this task.
The locking and unlocking of an electronic Safe occurs when the Safe Owner has lost her credentials or does not pay the agreed subscription fees. In this case the Safe Owner is no longer able to manage her data, but other Safe Users can still access the data previously released.
The Safe Owner might give up her electronic Safe. If she used the electronic Safe in e-government transactions, there will be other Safe Users, e.g. public authorities, relying on the accessibility of the released data. These data have to be stored until the retention period has expired. After the expiration date the data are deleted automatically. There have to be mechanisms to extend the retention period with the consent of the Safe Owner, and it is an open question what to do if the Safe Owner has given up the electronic Safe in the meantime.
Essential use cases
Besides the base functionality like storing and retrieving datasets 10, 20, it is interesting to discuss how the electronic Safe is integrated into e-government processes.
We start with a so-called "release order request", see Fig. 8, that the Safe User, e.g. a public authority, sends to the electronic Safe. The release order request contains the specification of all the information that is necessary for a particular administrative service. To give an example: suppose the Safe Owner applies online for a dedicated parking permit. For this administrative service the applicant has to provide an official statement about her current place of residence that is not older than 3 months. In this case the release order request contains exactly one document request. Typically, release order requests contain lists of data groups, like home address or birth information, not individual data items like street, number or month of birth. The data groups are structured around the conventional forms that are used today to apply for administrative services. The use case of an incoming release order request might include notifying the Safe Owner about a new waiting message by SMS or another convenient channel.
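The structure of a release order request can be sketched as follows. This is an illustration under assumptions: the class and field names (`ReleaseOrderRequest`, `DataGroupRequest`, `DocumentRequest`, `max_age_months`) and the table mapping groups to items are hypothetical and not taken from the described protocol. The sketch shows a request that names data groups rather than individual items, and the parking permit example with exactly one document request.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical mapping from data groups to the data items they bundle.
DATA_GROUPS = {
    "home_address": ["street", "number", "postal_code", "city"],
    "birth_information": ["day_of_birth", "month_of_birth", "year_of_birth"],
}


@dataclass(frozen=True)
class DataGroupRequest:
    group: str  # e.g. "home_address", not "street"


@dataclass(frozen=True)
class DocumentRequest:
    document_type: str
    max_age_months: Optional[int] = None  # None: any age is acceptable


@dataclass
class ReleaseOrderRequest:
    safe_user: str  # requesting party, e.g. a public authority
    service: str    # administrative service the data is needed for
    data_groups: list = field(default_factory=list)
    documents: list = field(default_factory=list)

    def requested_items(self) -> list:
        """Expand the requested data groups into individual data items."""
        return [item for g in self.data_groups for item in DATA_GROUPS[g.group]]


# The parking permit example: exactly one document request.
parking_permit = ReleaseOrderRequest(
    safe_user="Parking Authority",
    service="resident parking permit",
    documents=[DocumentRequest("statement_of_residence", max_age_months=3)],
)
```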
The next step is an activity of the Safe Owner, who receives the release order request in her Safe Owner Client, evaluates the requesting party and the underlying context, checks which data groups are requested and grants access to this bundle of data. The grants can be withdrawn as long as the Safe User is not working with these data. In most cases the Safe User, the public administration, will retrieve these data and check the completeness of the released data.
If the data are sufficient to base a decision on, the public authority requires that these data remain unchanged for documentation reasons until a certain retention period has expired. For this to happen, the Safe User locks the released data.
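The grant, withdraw and lock steps can be read as a small state machine. The sketch below is one possible interpretation, not the protocol itself: the names `ReleaseGrant` and `GrantState`, and the rule that a lock blocks withdrawal up to and including the retention date, are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class GrantState(Enum):
    GRANTED = "granted"
    WITHDRAWN = "withdrawn"
    LOCKED = "locked"


@dataclass
class ReleaseGrant:
    safe_user: str
    data_groups: tuple
    state: GrantState = GrantState.GRANTED
    retention_until: Optional[date] = None

    def lock(self, retention_until: date) -> None:
        """Safe User freezes the released data until the retention date."""
        if self.state is not GrantState.GRANTED:
            raise ValueError("only an active grant can be locked")
        self.state = GrantState.LOCKED
        self.retention_until = retention_until

    def withdraw(self, today: date) -> None:
        """Safe Owner revokes the grant unless the data is still locked."""
        if self.state is GrantState.LOCKED and today <= self.retention_until:
            raise PermissionError("released data is locked until retention expires")
        self.state = GrantState.WITHDRAWN


grant = ReleaseGrant("Parking Authority", ("home_address",))
grant.lock(date(2031, 1, 1))
try:
    grant.withdraw(today=date(2030, 6, 1))  # still inside the retention period
    blocked = False
except PermissionError:
    blocked = True
```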
A special use case is the combination of several successive release order/grant access communications. When the Safe Owner initiates a more complex administrative process that collects data depending on the data received in a previous step, the "online session" is the intended convenience functionality. Here the Safe Owner gives a kind of repeated access grant to a specified administrative process for the duration of this session.
While the focus here is on the Safe User communication and the incorporation into e-government processes, a list of other relevant use cases can be found in (Albert Jr. 2000).
Key requirements related to the integration in e-government processes
Security-related key requirements summarized
REQ_S01: The Safe Provider and the Storage Providers MUST NOT be able to infer the communication partners of the Safe Owner or the exchanged data.
REQ_S02: An eavesdropper MUST be prevented from gaining any knowledge about the stored or retrieved content. It MUST NOT be possible to store the exchanged encrypted messages until their decryption becomes possible in order to recover the safe content.
REQ_S03: It MUST NOT be possible for an adversary to retrieve any data that belong to another Safe Owner or any data that are part of a release order request.
REQ_S04: The Safe Owner SHOULD be able to grant permissions to save, send, copy or print her data. Therefore the owner SHOULD be informed whether the requesting system is a system that can enforce these permissions (mutual attestation of the involved systems).
REQ_S05: The Safe Owner and the Safe User MUST be provided with a way to discover when the Safe Client and the Safe Infrastructure they are working with are in an untrustworthy state (trusted display problem).
REQ_S06: The Safe Owner and the Safe User MUST be able to use the Safe Infrastructure with knowledge (a convenient password) and some item that is easy to take with them and difficult to exchange without the user noticing.
Organizational
REQ_O01: The safe MUST be designed in such a way that multiple steps of safe incorporation are possible. The first step is the complementary usage of safes to facilitate data collection via conventional forms. The second step SHOULD be the incorporation into process infrastructures. The third step MAY be exclusive access via trusted clients.
REQ_O02: As nationwide public key infrastructures are not yet available, the safe protocols MUST be designed to take that into account. Release Order Requests MUST therefore transmit the public key of the Safe User, and the Safe Provider MUST provide a service to retrieve the public key of the Safe Owner. It MUST be possible to validate signatures of Safe Users offline.
REQ_O03: The released data MUST remain unchanged until the retention period has expired.
REQ_O04: The Safe Owner MUST be able to have a backup in case she loses the access media.
Technical requirements
REQ_T01: As most service-oriented architecture building blocks rely on web services, the protocols used SHOULD be built upon the web services stack and comply with the WS-I recommendations.
Business requirements
REQ_B01: Multiple clients MUST be usable, e.g. personal computer, kiosk system, mobile phone.
REQ_B02: Standards and open systems MUST be used wherever possible to create a broad community of Safe Providers.
REQ_B03: The XML data structures to be stored in the safe MUST be extensible, as they will result from and be used in forms of all kinds. The design MUST allow available data to be mapped to required data and existing schemata to be incorporated.
REQ_B04: The response time of the Safe Client application SHOULD be comparable to that of current web applications (search engines). The initial time to open the safe and the loading of large documents SHOULD each take no more than a minute.
REQ_B05: The safe MUST allow the versioning of all data and documents stored in it.
Proposed Solution
Systems architecture conceptual view
Based on the requirements outlined above, a prototype is described that supports an application for a place in an after-school care club. The conventional application form requires data about the school children, the parents, their address and much more information, up to their financial situation. The form comprises multiple pages, so the electronic safe clearly creates a real advantage over the conventional form in an e-government portal.
To integrate the electronic safe into current e-government portals, the base components shown in Fig. 9 are assumed.
In addition to common infrastructure components like a case management system and the archive storing case data and documents, there is a modern process management solution connected to a portal that the citizen may use as an entry point to initiate administrative services. The process management incorporates services provided by electronic safe integration components.
Trusted viewer
There is a certain demand to avoid the dispersal of sensitive, private information across various IT systems. However, in current e-government scenarios citizens have to provide all necessary data as unprotected clear text, even if the data are only used for viewing. The future use of these data is beyond the citizens' control. The Trusted Viewer is dedicated hardware and software that permits the Safe User to view Safe content while preserving the Safe Owner's control over her data.
In accordance with REQ_S04, it disallows any data manipulation, duplication or storage without the consent of the Safe Owner. Since it operates on a trusted computing infrastructure, the Trusted Viewer can provide proof of a well-defined state and that it will perform as expected. The Safe delivers the Safe Owner's data only after verifying that proof. The Safe Owner thus remains in control of her sensitive data, which are not dispersed. From the process modeling point of view, the Trusted Viewer is a desktop where the public officer executes human interaction tasks that require working with real sensitive data.
An alternative option is to separate the case data from identifying items and to use pseudonyms instead. The Safe Owner is able to prove ownership of a pseudonym. It is therefore reasonable to decide applications based on pseudonyms in favor of the Safe Owner's privacy. However, the Trusted Viewer is the preferable way, because it is difficult to determine whether data are identifying or not.
Safe Request Management
As each data retrieval includes multiple requests to the Safe Provider and the Storage Providers, this complex protocol is encapsulated within the Safe Request Management. The Safe Request Management is used by each of the clients, i.e. the Trusted Viewer(s) and the Rule Engine.
Rule Engine
The Rule Engine is capable of executing simple tests on the data retrieved from the electronic safe. A certain share of incoming applications could be pre-evaluated by this rule engine without any human involvement. This could ease the human interaction tasks later on. Additionally, it might be viable to let the rule engine work on the real sensitive data and to blind these data in the following steps where humans are involved.
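A pre-evaluation of this kind could look like the following sketch. The rule format, the field names and the two example rules are hypothetical; the text only states that simple tests run on the retrieved data without human involvement. The second rule borrows the "not older than 3 months" condition from the parking permit example.

```python
from datetime import date
from typing import Callable

# A rule is a name plus a predicate over the application data and today's date.
Rule = tuple[str, Callable[[dict, date], bool]]


def older_than_months(issued: date, today: date, months: int) -> bool:
    """True if more than `months` calendar months lie between the two dates."""
    elapsed = (today.year - issued.year) * 12 + (today.month - issued.month)
    return elapsed > months or (elapsed == months and today.day > issued.day)


RULES: list[Rule] = [
    ("home address present",
     lambda app, today: bool(app.get("home_address"))),
    ("residence statement at most 3 months old",
     lambda app, today: not older_than_months(
         app["residence_statement_issued"], today, 3)),
]


def pre_evaluate(application: dict, today: date, rules: list = RULES) -> list:
    """Return the names of all rules that the application fails."""
    return [name for name, check in rules if not check(application, today)]


application = {
    "home_address": "Example Street 1",
    "residence_statement_issued": date(2011, 2, 1),
}
failed = pre_evaluate(application, today=date(2011, 6, 14))
```

An application with an empty result could be passed on automatically, while any failed rule would route it to a human interaction task.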
Certificate Issuance
Two types of certificates are used: X.509 certificates and idemix certificates.
The X.509 certificates are used throughout the communication whenever the communicating parties need to know each other, e.g. to sign release order requests or to hide the communicating parties from the Safe Provider by encrypting the requests and their answers.
The idemix certificates are used whenever the service provider needs to authorize a client without revealing her identity. Such a case occurs when the Safe Owner wants to store data with the Storage Provider. The Safe Owner has to prove that she has the right to store data ("I have paid the subscription fee!") without telling who she is.
The Safe User needs a Certificate Issuance to be able to issue role-based certificates. If the citizen releases data, these data are encrypted for a particular receiver. The receiver might be a natural person or a role, e.g. the "Single Point of Contact" mentioned in the European Services Directive.
Data structures
The data model follows the idea that the valuable information, e.g. a document, is split into separate chunks of bytes that are stored in small "buckets" with different Storage Providers. The information about which chunks belong to which document is encrypted and stored with the Safe Provider. The buckets stored with the Storage Provider are secured against unauthorized access by cryptographic means, so-called idemix domain pseudonyms.
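A minimal sketch of this data model follows. It makes several assumptions that the text leaves open: fixed-size chunks, random bucket identifiers, round-robin placement, and an in-memory `StorageProvider` class standing in for a real storage web service. Encryption of the bucket list and the idemix-based access control are omitted.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class StorageProvider:
    """In-memory stand-in for an independent storage service."""
    name: str
    buckets: dict = field(default_factory=dict)  # bucket id -> chunk bytes

    def put(self, chunk: bytes) -> str:
        bucket_id = secrets.token_hex(16)  # random, unlinkable identifier
        self.buckets[bucket_id] = chunk
        return bucket_id

    def get(self, bucket_id: str) -> bytes:
        return self.buckets[bucket_id]


def store_document(data: bytes, providers: list, chunk_size: int) -> list:
    """Spread chunks over the providers; return the ordered bucket list."""
    bucket_list = []
    for i in range(0, len(data), chunk_size):
        provider = providers[(i // chunk_size) % len(providers)]
        bucket_id = provider.put(data[i:i + chunk_size])
        bucket_list.append((provider.name, bucket_id))
    return bucket_list


def load_document(bucket_list: list, providers: list) -> bytes:
    """Fetch the buckets in list order and concatenate their chunks."""
    by_name = {p.name: p for p in providers}
    return b"".join(by_name[name].get(bid) for name, bid in bucket_list)


providers = [StorageProvider(f"provider-{i}") for i in range(3)]
bucket_list = store_document(b"certificate of residence", providers, chunk_size=8)
restored = load_document(bucket_list, providers)
```

Only the holder of the ordered bucket list can reassemble the document, which is why the described system stores that list in encrypted form with the Safe Provider.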
The central classes of the information model are documented in (Penski 2010).
Safe User Communication protocols
The splitting of all information into chunks of bytes that are stored with independent Storage Providers applies even to the release order requests. The diagram in Fig. 10 abstracts from the necessary authentication and authorization procedures with the Safe Provider and the Storage Providers.
The starting point is the Public Officer, who works on an application of a person who owns an electronic Safe. The officer specifies in his Trusted Viewer which information is needed and which Safe should be used. The Safe Request Management first retrieves the public key of the Safe Owner. This key is used later on to encrypt the information about where the release order request is stored and how the Safe Owner client can reassemble it. Then the release order is split into several parts that are subsequently stored with independent Storage Providers. Each of these steps requires a new authorization sequence, because the Safe Request Management remains completely anonymous in relation to the Storage Provider. The Storage Provider is not able to discover whether two subsequently arriving requests were sent by the same Safe User. As soon as all parts are stored in buckets, the list of all buckets, ordered in their correct sequence, is encrypted with the public key of the Safe Owner. The encrypted bucket list is stored with the Safe Provider. Finally, as a convenience service, the Safe Provider informs the Safe Owner that "something" has arrived.
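The sequence just described can be traced in a short sketch. Like Fig. 10, it leaves out the authentication and authorization steps. All class and function names are hypothetical, the providers are in-memory stand-ins, and the public-key encryption is a caller-supplied function so that no particular cryptographic library is implied. The `placeholder_encrypt` used in the example performs no encryption at all and only makes the data flow visible.

```python
import json
import secrets
from typing import Callable


class InMemoryStorageProvider:
    """Stand-in for a Storage Provider web service."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.buckets = {}

    def store(self, part: bytes) -> str:
        bucket_id = secrets.token_hex(16)
        self.buckets[bucket_id] = part
        return bucket_id


class InMemorySafeProvider:
    """Stand-in for the Safe Provider: key lookup, inbox, notification."""

    def __init__(self, owner_keys: dict) -> None:
        self.owner_keys = owner_keys
        self.inbox = {}
        self.notified = []

    def public_key_of(self, safe_address: str) -> bytes:
        return self.owner_keys[safe_address]

    def deposit(self, safe_address: str, encrypted_bucket_list: bytes) -> None:
        self.inbox.setdefault(safe_address, []).append(encrypted_bucket_list)
        self.notified.append(safe_address)  # "something" has arrived


def send_release_order(request: bytes, safe_address: str, safe_provider,
                       storage_providers: list,
                       encrypt: Callable[[bytes, bytes], bytes],
                       parts: int) -> None:
    # 1. Retrieve the public key of the Safe Owner.
    owner_key = safe_provider.public_key_of(safe_address)
    # 2. Split the request and store each part with a Storage Provider.
    size = -(-len(request) // parts)  # ceiling division
    bucket_list = []
    for i in range(parts):
        provider = storage_providers[i % len(storage_providers)]
        bucket_id = provider.store(request[i * size:(i + 1) * size])
        bucket_list.append([provider.name, bucket_id])
    # 3. Encrypt the ordered bucket list for the Safe Owner.
    sealed = encrypt(owner_key, json.dumps(bucket_list).encode())
    # 4. Store it with the Safe Provider, which notifies the Safe Owner.
    safe_provider.deposit(safe_address, sealed)


def placeholder_encrypt(public_key: bytes, message: bytes) -> bytes:
    """NOT encryption: prefixes the key so the flow can be inspected."""
    return public_key + b":" + message


safe = InMemorySafeProvider({"safe-42": b"OWNER-PUBLIC-KEY"})
stores = [InMemoryStorageProvider(f"sp{i}") for i in range(3)]
send_release_order(b"request: home_address for parking permit", "safe-42",
                   safe, stores, placeholder_encrypt, parts=3)
```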
This is only a very small part of the entire communication protocol; however, the underlying principle should be visible: all communicating parties exchange lists of buckets that contain pointers to the real information. If the Safe Provider is able to break the encryption of this information (probably after a period of six to ten years), it still does not gain access to the buckets stored with the Storage Providers. Since the buckets are relocated from time to time, the information gained from breaking the encryption is useless.
Conclusion
Embodiments have been presented that incorporate sensitive personal information into modern e-government processes. The focus was on the Safe User infrastructure.
The risk of data leakage caused by the operating staff of a single online storage provider is addressed by the separation of responsibilities and the dispersal of data to independent Storage Providers. The risk of unauthorized access to the Safe content is minimized by domain pseudonyms stored together with the data. As the weakest point is the client on which the sensitive information is displayed, the Trusted Viewer and the Safe Client of the Safe Owner are secured by trusted computing mechanisms (Trusted Computing Group, Incorporated 2007). The risk of a superior instance controlling all the Storage Providers and the Safe Providers can best be reduced by providing an easy way to offer and incorporate new independent Storage Services that are not controllable by a superior instance.
As a result, a trustworthy decentralized e-government infrastructure component is achieved by splitting responsibilities among multiple organizational units and by combining privacy-enhancing technologies with trusted computing and information dispersal.
References
Albert Jr., W.P. and Kotzin, M. (2000), "SMART CARD WITH BACK UP", United States patent number 7206847, Filing date 22 May 2000, Issue date 17 Apr. 2007.
Cacayorin, P. (2004), "Dispersed data storage using cryptographic scrambling description" [online], http://www.freshpatents.com/Dispersed-data-storage-using-cryptographic-scrambling-dt20060413ptan20060078127.php, USPTO Patent Application 20060078127, Filing date 8 Oct. 2004.
Camenisch, J. and Herreweghen, E.V. (2002), "Design and implementation of the idemix anonymous credential system", CCS '02: Proceedings of the 9th ACM conference on Computer and communications security, New York, NY, USA, ACM Press, pp. 21-30.
Chaum, D. (1985), "Security without identification: Transaction systems to make big brother obsolete", Communications of the ACM 28(10), pp. 1030-1044.
DIRECTIVE 2006/123/EC OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 12 December 2006 on services in the internal market, available at: http://ec.europa.eu/internal_market/services/services-dir/index_en.htm
Iyengar, A., Cahn, R., Garay, J. A., Jutla, G. (1998), "Design and Implementation of a Secure Distributed Data Repository", In Proc. of the 14th IFIP Internat. Information Security Conf., pp. 123-135.
Kubiatowicz, Bindel, Chen, Eaton, Geels, Gummadi, Rhea, Weatherspoon, Weimer, Wells, and Zhao (2000), "Oceanstore: An architecture for global-scale persistent storage", Proc. of the ninth international conference on Architectural support for programming languages and operating systems (ASPLOS '00), ACM SIGARCH Computer Architecture News, Dec. 2000, pp. 190-201, ISSN: 0163-5964.
Paul, A., Agarwala, S. and Ramachandran, U. (2007), "e-SAFE: An Extensible, Secure and Fault Tolerant Storage System", Proc. IEEE Self-Adaptive and Self-Organizing Systems (SASO '07), IEEE Press, pp. 257-268, doi: 10.1109/SASO.2007.21
Penski, A. and Breitenstrom, C. (2010) in Press, "Electronic safes: information model and security approach", Proc. of the 2nd International Conference on Networks Security, Wireless Communications and Trusted Computing (NSWCTC 2010)
Rabin, M. (1989), "Efficient dispersal of information for security, load balancing, and fault tolerance", Journal of the ACM, Vol. 36, pp. 335-348.
Redlich, R.M. and Nemzow, M.A. (2008), "DATA SECURITY SYSTEM AND METHOD WITH PARSING AND DISPERSION TECHNIQUES", United States patent number 7349987, Filing date 23 May 2002, Issue date 25 Mar 2008.
Schneier, B. (2009), "Crypto-Gram-Newsletter" [online], http://www.schneier.com/crypto-gram-0901.html.
Trusted Computing Group, Incorporated (2007), "TCG Software Stack (TSS) Specification Version 1.2", http://www.trustedcomputinggroup.org/resources/tcg_software_stack_tss_specification.
Zhang, Zhang, Xian, Chen, Feng (2008), "Towards A Secure Distribute Storage System", Advanced Communication Technology (ICACT 2008), IEEE Press, Apr. 2008, pp. 1612-1617, doi: 10.1109/ICACT.2008.4494090

Claims

1. Method for safely handling at least one dataset, in particular a document (10, 20), particularly in a cloud computer environment, wherein
a) the at least one dataset (10, 20) is partitioned into at least two dataset partitions (1, 2, 3, 4),
b) the at least two dataset partitions (1, 2, 3, 4) are stored on and / or retrieved from at least two computer sites (100, 200, 300, 400) being part of, e.g., a cloud computer environment, so that on no computer site (100, 200, 300, 400) sufficient, in particular all, dataset partitions (1, 2, 3, 4) of the at least one dataset (10, 20) are present, in particular not present at the same time.
2. Method for safely handling at least one dataset, in particular a document (10, 20), in, e.g., a cloud computer environment, wherein a retrieval of a partitioned dataset (1, 2, 3, 4) to combine the at least one dataset (10, 20) comprises
a) an authorization by a user, in particular an owner (1000) of the at least one dataset (10, 20) at a safe system (1010),
b) on positive completion of the authorization, automatic retrieval of distribution information regarding the dataset partitions (1, 2, 3, 4) from the safe system (1010),
c) retrieval of all dataset partitions (1, 2, 3, 4) of the at least one dataset (10, 20) from computer sites (100, 200, 300, 400),
d) transformation of the dataset partitions (1, 2, 3, 4) into a logical representation of the at least one dataset (10, 20).
3. Method for safely handling at least one dataset, in particular a document (10, 20), in, e.g., a cloud computer environment, wherein a storing of a partitioned dataset (1, 2, 3, 4) derived from the at least one dataset (10, 20) comprises
a) an authorization by an owner (1000) of the at least one dataset (10, 20) at a safe system (1010),
b) on positive completion of the authorization, automatic transfer of distribution information regarding the dataset partitions (1, 2, 3, 4) to the safe system (1010),
c) automatic storing all dataset partitions (1, 2, 3, 4) of the at least one dataset (10, 20) on computer sites (100, 200, 300, 400) as physical sub-representations of the at least one dataset (10, 20), so that none of the computer sites (100, 200, 300, 400) comprises more than one instance of the dataset partitions (1, 2, 3, 4), in particular not at the same time.
4. Method according to at least one of the preceding claims, wherein the methods of claim 1, 2 and / or 3 are executed on a system under control of the owner (1000) of the dataset (10, 20), in particular on a mobile device (1001).
5. Method according to at least one of the preceding claims, wherein the distribution-information regarding the document partitions (1, 2, 3, 4) is stored in a safe system (1010).
6. Method according to at least one of the preceding claims, wherein the at least one dataset (10, 20) is partitioned and / or the document (10, 20) is recombined on a client under control of the owner (1000) of the at least one dataset (10, 20).
7. Method according to at least one of the preceding claims, wherein the at least one dataset (10, 20) is automatically partitioned according to a predefined set of rules, in particular into dataset partitions (1, 2, 3, 4) of equal size or according to a secret sharing algorithm of Rabin.
8. Method according to any of the preceding claims, wherein each dataset partition (1, 2, 3, 4) is automatically associated with a distribution-information, in particular a distribution tag (11, 12, 13).
9. Method according to any of the preceding claims, wherein the dataset partitions (1, 2, 3, 4) are at least partially automatically encrypted.
10. Method according to any of the preceding claims, wherein at least for a subset of the at least two dataset partitions (1, 2, 3, 4) the storage site (100, 200, 300, 400) is dynamically changed.
11. Method according to any of the preceding claims, wherein communication with the safe system (1010) and / or storage provider is at least partially anonymous.
12. Method according to any of the preceding claims, wherein the distribution information is safely transmitted to a further user of the dataset (10, 20).
13. Method according to any of the preceding claims, wherein at least one user of a safe system (1010) is a public administration.
14. System for safely handling at least one dataset, in particular a document (10, 20), in, e.g., a cloud computer environment, with
a) a partition means to partition the at least one dataset (10, 20) into at least two dataset partitions (1, 2, 3, 4),
b) a safe system (1010) for storing and / or retrieving the at least two dataset partitions (1, 2, 3, 4) from and / or to at least two computer sites (100, 200, 300, 400) being part of, e.g., a cloud computer environment, so that no computer site (100, 200, 300, 400) comprises instances of all dataset partitions (1, 2, 3, 4) of the at least one dataset (10, 20), in particular not at the same time.
15. System for safely handling at least one dataset, in particular a document (10, 20), in, e.g., a cloud computer environment, with retrieval means for a partitioned dataset (1, 2, 3, 4) to combine the at least one dataset (10, 20), characterized by an authorization means for a user, in particular an owner (1000) of the at least one dataset (10, 20) at a safe system (1010), and a retrieval means which, on positive completion of the authorization, uses distribution information regarding the dataset partitions (1, 2, 3, 4) to automatically retrieve from the, e.g., cloud computer environment all dataset partitions (1, 2, 3, 4) of the at least one dataset (10, 20) from computer sites (100, 200, 300, 400), and the dataset partitions (1, 2, 3, 4) are transformed into a logical representation of the at least one dataset (10, 20).
16. System for safely handling at least one dataset, in particular a document (10, 20), in, e.g., a cloud computer environment with storing means for a partitioned dataset (1, 2, 3, 4) derived from the at least one dataset (10, 20), characterized by an authorization means for an owner (1000) of the at least one dataset (10, 20) at a safe system (1010) and a storing means which, on positive completion of the authorization, uses automatically generated distribution information regarding the dataset partitions (1, 2, 3, 4) to automatically store all dataset partitions (1, 2, 3, 4) of the at least one dataset (10, 20) on computer sites (100, 200, 300, 400) as physical representations of the at least one dataset (10, 20), so that none of the computer sites (100, 200, 300, 400) comprises more than one instance of the dataset partitions (1, 2, 3, 4), in particular not at the same time, and to store the distribution information at the safe system (1010).
PCT/EP2011/059846 2010-06-14 2011-06-14 Methods and systems for securely handling datasets in computer systems WO2011157708A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP10075257 2010-06-14
EP10075257.5 2010-06-14
DE102010023894 2010-06-16
DE102010023894.5 2010-06-16

Publications (1)

Publication Number Publication Date
WO2011157708A1 (en) 2011-12-22

Family

ID=44628152

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2011/059846 WO2011157708A1 (en) 2010-06-14 2011-06-14 Methods and systems for securely handling datasets in computer systems

Country Status (2)

Country Link
DE (2) DE102011077512A1 (en)
WO (1) WO2011157708A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102843381A (en) * 2012-09-14 2012-12-26 江苏乐买到网络科技有限公司 Cloud computing system capable of guaranteeing account security
WO2013131273A1 (en) * 2012-03-09 2013-09-12 Empire Technology Development Llc Cloud computing secure data storage
WO2016093918A2 (en) 2014-11-03 2016-06-16 CRAM Worldwide, Inc. Secured data storage on a hard drive
WO2018023144A1 (en) * 2016-08-04 2018-02-08 Ait Austrian Institute Of Technology Gmbh Method for checking the availability and integrity of a data object stored in a distributed manner
WO2019129642A1 (en) * 2017-12-31 2019-07-04 Bundesdruckerei Gmbh Secure storage of and access to files through a web application
EP3591930A4 (en) * 2017-03-03 2020-01-22 Tencent Technology (Shenzhen) Company Limited Information storage method, device, and computer-readable storage medium
US10735529B2 (en) 2017-12-07 2020-08-04 At&T Intellectual Property I, L.P. Operations control of network services

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7206847B1 (en) 2000-05-22 2007-04-17 Motorola Inc. Smart card with back up
US7349987B2 (en) 2000-11-13 2008-03-25 Digital Doors, Inc. Data security system and method with parsing and dispersion techniques
US20050240749A1 (en) * 2004-04-01 2005-10-27 Kabushiki Kaisha Toshiba Secure storage of data in a network
US20060078127A1 (en) 2004-10-08 2006-04-13 Philip Cacayorin Dispersed data storage using cryptographic scrambling
US20080060085A1 (en) * 2006-03-10 2008-03-06 Jan Samzelius Protecting Files on a Storage Device from Unauthorized Access or Copying
WO2007133791A2 (en) * 2006-05-15 2007-11-22 Richard Kane Data partitioning and distributing system

Non-Patent Citations (15)

* Cited by examiner, † Cited by third party
Title
"TCG Software Stack", TRUSTED COMPUTING GROUP, 2007
CAMENISCH, J., HERREWEGHEN, E.V.: "CCS '02: Proceedings of the 9th ACM conference on Computer and communications security", 2002, ACM PRESS, article "Design and implementation of the idemix anonymous credential system", pages: 21 - 30
CHAUM, D.: "Security without identification: Transaction systems to make big brother obsolete", COMMUNICATIONS OF THE ACM, vol. 28, no. 10, 1985, pages 1030 - 1044, XP002000086, DOI: doi:10.1145/4372.4373
CHAUM, D.: "Security without identification: Transaction systems to make big brother obsolete", COMMUNICATIONS OF THE ACM, vol. 28, no. 10, pages 1030 - 1044, XP002000086, DOI: doi:10.1145/4372.4373
IYENGAR, A ET AL.: "Design and Implementation of a Secure Distributed Data Repository", IN PROC. OF THE 14TH IFIP INTERNAT. INFORMATION SECURITY CONF., 1998, pages 123 - 135
IYENGAR, A., CAHN, R., GARAY, J.A., JUTLA, G.: "Design and Implementation of a Secure Distributed Data Repository", PROC. OF THE 14TH IFIP INTERNAT. INFORMATION SECURITY CONF., 1998, pages 123 - 135
KUBIATOWICZ ET AL.: "Oceanstore: An architecture for global-scale persistent storage", PROC. OF THE NINTH INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS (ASPLOS '00), ACM SIGARCH COMPUTER ARCHITECTURE NEWS, December 2000 (2000-12-01), pages 190 - 201
KUBIATOWICZ, BINDEL, CHEN, EATON, GEELS, GUMMADI, RHEA, WEATHERSPOON, WEIMER, WELLS: "Oceanstore: An architecture for global-scale persistent storage", PROC. OF THE NINTH INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS (ASPLOS '00), ACM SIGARCH COMPUTER ARCHITECTURE NEWS, December 2000 (2000-12-01), pages 190 - 201
PAUL, A. ET AL.: "Proc. IEEE Self-Adaptive and Self-Organizing Systems,(SASO '07", 2007, IEEE PRESS, article "e-SAFE: An Extensible, Secure and Fault Tolerant Storage System", pages: 257 - 268
PAUL, A., AGARWALA, S., RAMACHANDRAN, U.: "Proc. IEEE Self-Adaptive and Self-Organizing Systems, (SASO '07", 2007, IEEE PRESS, article "e-SAFE: An Extensible, Secure and Fault Tolerant Storage System", pages: 257 - 268
PENSKI, A., BREITENSTROM, C.: "Electronic safes: information model and security approach", PROC. OF THE 2ND INTERNATIONAL CONFERENCE ON NETWORKS SECURITY, WIRELESS COMMUNICATIONS AND TRUSTED COMPUTING (NSWCTC 2010, 2010
RABIN, M: "Efficient dispersal of information for security, load balancing, and fault tolerance", JOURNAL OF THE ACM, vol. 36, 1989, pages 335 - 348, XP000570108, DOI: doi:10.1145/62044.62050
SCHNEIER, B., CRYPTO-GRAM-NEWSLETTER, 2009, Retrieved from the Internet <URL:http://www.schneier.com/crypto-gram-0901.html>
ZHANG ET AL.: "Advanced Communication Technology, (ICACT 2008", April 2008, IEEE PRESS, article "Towards A Secure Distribute Storage System", pages: 1612 - 1617
ZHANG, ZHANG, XIAN, CHEN, FENG: "Advanced Communication Technology, (ICACT 2008", April 2008, IEEE PRESS, article "Towards A Secure Distribute Storage System", pages: 1612 - 1617

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013131273A1 (en) * 2012-03-09 2013-09-12 Empire Technology Development Llc Cloud computing secure data storage
US9225770B2 (en) 2012-03-09 2015-12-29 Empire Technology Development Llc Cloud computing secure data storage
US9882917B2 (en) 2012-03-09 2018-01-30 Empire Technology Development Llc Cloud computing secure data storage
CN102843381A (en) * 2012-09-14 2012-12-26 江苏乐买到网络科技有限公司 Cloud computing system capable of guaranteeing account security
EP3215927A4 (en) * 2014-11-03 2018-07-04 Secured2 Corporation Secured data storage on a hard drive
WO2016093918A2 (en) 2014-11-03 2016-06-16 CRAM Worldwide, Inc. Secured data storage on a hard drive
JP2019523458A (en) * 2016-08-04 2019-08-22 エーアイティー オーストリアン インスティテュート オブ テクノロジー ゲゼルシャフト ミット ベシュレンクテル ハフツングAIT Austrian Institute of Technology GmbH A method for checking the availability and integrity of distributed data objects
WO2018023144A1 (en) * 2016-08-04 2018-02-08 Ait Austrian Institute Of Technology Gmbh Method for checking the availability and integrity of a data object stored in a distributed manner
US10884846B2 (en) 2016-08-04 2021-01-05 Ait Austrian Institute Of Technology Gmbh Method for checking the availability and integrity of a distributed data object
JP7116722B2 (en) 2016-08-04 2022-08-10 エーアイティー オーストリアン インスティテュート オブ テクノロジー ゲゼルシャフト ミット ベシュレンクテル ハフツング Methods for checking the availability and integrity of distributed data objects
EP3591930A4 (en) * 2017-03-03 2020-01-22 Tencent Technology (Shenzhen) Company Limited Information storage method, device, and computer-readable storage medium
US10735529B2 (en) 2017-12-07 2020-08-04 At&T Intellectual Property I, L.P. Operations control of network services
US11277482B2 (en) 2017-12-07 2022-03-15 At&T Intellectual Property I, L.P. Operations control of network services
US11659053B2 (en) 2017-12-07 2023-05-23 At&T Intellectual Property I, L.P. Operations control of network services
WO2019129642A1 (en) * 2017-12-31 2019-07-04 Bundesdruckerei Gmbh Secure storage of and access to files through a web application
US11675922B2 (en) 2017-12-31 2023-06-13 Bundesdruckerei Gmbh Secure storage of and access to files through a web application

Also Published As

Publication number Publication date
DE102011077513A1 (en) 2012-08-23
DE102011077512A1 (en) 2012-03-01

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 11730246; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 11730246; Country of ref document: EP; Kind code of ref document: A1)