WO2004077215A2 - System and method for data migration and conversion - Google Patents

System and method for data migration and conversion Download PDF

Info

Publication number
WO2004077215A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
source
migration
user
target
Prior art date
Application number
PCT/IN2004/000026
Other languages
French (fr)
Other versions
WO2004077215A3 (en)
Inventor
Vinayak K. Rao
Original Assignee
Vaman Technologies (R & D) Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vaman Technologies (R & D) Limited filed Critical Vaman Technologies (R & D) Limited
Publication of WO2004077215A2 publication Critical patent/WO2004077215A2/en
Publication of WO2004077215A3 publication Critical patent/WO2004077215A3/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database

Definitions

  • Newer attributes were added to define these data types and associate their persistent form requirements such as scale, precision, field width, formats for representation etc. Scale and precision specified the bytes consumed before and after the decimal place. Formats of representation of data also got standardized, as the same data, when persisted in various formats, could lead to data misinterpretation. For example, in date/month/year (dd/mm/yyyy) format, date information with day values less than 13 could be misinterpreted if the format were changed to month/date/year (mm/dd/yyyy).
  • middleware which allows data migration and conversion as well as mapping and exchange of legacy data without disturbing the legacy application.
  • middleware is very expensive and requires tremendous reprogramming to achieve the desired results.
  • the user or person carrying out the migration needs to have a detailed understanding of both the source as well as the target system for doing valid migration and mapping.
  • system or method which can enable data migration in a cost effective manner, without having an in-depth knowledge of the source and target systems.
  • the present invention provides a software-implemented process, system, and method for use in a computing environment.
  • the preferred embodiment of the present invention provides a powerful mechanism to transfer data (i.e. Tables) between heterogeneous databases which could either be ODBC / OLEDB / JDBC compliant or non-compliant. It makes it easy to import, export and transform data between any popular data formats including binary, text, databases etc.
  • the present invention provides a data migration tool used to transfer data between heterogeneous databases, as well as a file loader used to import data from flat files.
  • the present invention is a data migration tool which allows mapping and extracting of any pattern of data, irrespective of the source of generation of the data pattern, to a target ODBC / OLEDB / JDBC compliant format or any user-defined pattern.
  • This entire process may be a one-time or a periodic effort, which can be configured as a template and scheduled as a job.
  • the data migration process consists mainly of three phases: Pre-migration, Data Migration Service and Post-migration.
  • the Data Migration of the present invention resolves the nearest data type relations. Resolution is based on SQL data types defined by ODBC / OLEDB / JDBC and the target data type expected without compromising the data value or relational integrity.
  • the Non-ODBC/Non-OLEDB/Non-JDBC compliant file could be a Text file, a DBMS file, a Binary file or a User Defined Pattern of data based on the interpretation of control character data, such as Ctrl or carriage return characters, in the file.
  • If the data source is Non-ODBC compliant it could be binary, text or a preprocess command. If the data source is ODBC compliant then the user selects the source (to copy data from) ODBC driver, for which the Data Source Name (DSN) consisting of appropriate parameters like server name, database user, etc. is checked for existence.
  • DSN Data Source Name
  • the data migration tool translates the next data record and then reads the data as per the DDL.
  • FIG 1 is a block diagram depicting the classification of the source of persistent data as Relational or Non-relational. Further the Non-relational could be Binary, Text, DBMS file with single or multiple DDL. The data after transformation and cleaning is in the target Database.
  • FIG 2 is a block diagram depicting the Migration process consisting mainly of three phases: the Pre-Migration (PRE), actual Migration (DMS) and Post-Migration (POST).
  • FIG. 3 is a block diagram depicting the functional blocks of the preferred embodiments of the present invention.
  • FIG. 4 is an Object matrix
  • FIG. 5 (pages a and b) is a flow chart illustrating a preferred embodiment showing the mechanism for Migration from one ODBC compliant format to another ODBC compliant format using the Data Migration Tool in accordance with the present invention.
  • FIG. 6 is a flow chart illustrating an alternative embodiment of the present invention, depicting the mechanism for Migration from text file to ODBC compliant format using the Data Migration Tool in accordance with the present invention.
  • FIG. 7 is a flow chart illustrating an alternative embodiment depicting the mechanism for Migration from binary file to ODBC compliant format using the Data Migration Tool in accordance with the present invention.
  • Fig. 8 illustrates the first step of transfer of data between Heterogeneous Databases using the Data Migration Wizard.
  • FIG. 9 illustrates the second step of transfer of data between Heterogeneous Databases using the Data Migration Wizard, showing the selection of the ODBC driver, DSN, user ID and password.
  • Fig. 10 illustrates the third step of transfer of data between Heterogeneous Databases using Data Migration Wizard depicting the tables to choose for migration.
  • Fig. 11 illustrates the fourth step of transfer of data between Heterogeneous Databases using Data Migration Wizard depicting the destination ODBC driver, DSN, userid and password.
  • Fig. 12 illustrates the fifth step of transfer of data between Heterogeneous Databases using Data Migration Wizard depicting the selection of options for migration.
  • Fig. 13 illustrates the sixth step of transfer of data between Heterogeneous Databases using Data Migration Wizard showing the option to transform the column information.
  • Fig. 14 illustrates the sixth step of transfer of data between Heterogeneous Databases using Data Migration Wizard depicting the selected columns and available columns.
  • Fig. 15 illustrates the sixth step of transfer of data between Heterogeneous Databases using Data Migration Wizard for reporting the data Migration Progress.
  • Fig. 16 illustrates the present invention for importing data from Flat Files using a File Loader
  • Fig. 17 illustrates the present invention for importing data from Flat Files using a File Loader
  • Fig. 18 illustrates the present invention for importing data from Flat Files using a File Loader
  • Fig. 19 illustrates the present invention for importing data from Flat Files using a File Loader
  • Fig. 20 illustrates the present invention for importing data from Flat Files using a File Loader
  • Fig. 21 illustrates the present invention for importing data from Flat Files using a File Loader
  • Fig. 22 illustrates the present invention for importing data from Flat Files using a File Loader
  • Fig. 23 illustrates the present invention for importing data From Flat files using a File Loader
  • Fig. 24 illustrates the present invention for importing data from Flat Files using a File Loader
  • Fig. 25 illustrates the present invention for importing data from Flat Files using a File Loader
  • Fig. 26 illustrates the present invention for importing data from Flat Binary Files using a File Loader
  • Fig. 27 illustrates the present invention for importing data from Flat Binary Files using a File Loader
  • Fig. 28 illustrates the present invention for importing data from Flat Binary Files using a File Loader
  • Fig. 29 illustrates the present invention for importing data from Flat Binary Files using a File Loader
  • Fig. 30 illustrates the present invention for importing data from Flat Binary Files using a File Loader
  • Fig. 31 illustrates the present invention for importing data from Flat Binary Files using a File Loader
  • FIG. 1 there is shown a block diagram depicting the classification of the persistent data 100 irrespective of the source of generation as ODBC/OLEDB/JDBC compliant 105 and Non-ODBC/Non- OLEDB/Non-JDBC compliant data-files 110.
  • These data generation sources are endian-neutral and are also independent of the operating system (such as Windows, Linux, MacOS, Unix etc.) and language.
  • the Non-ODBC/Non-OLEDB/Non-JDBC compliant (flat files) 110 could be a Text file 115, a DBMS file 120, a Binary file 125 or a User Defined Pattern of data 127 based on the interpretation of control character data, such as Ctrl or carriage return characters, in the file.
  • For example, the Disk Operating System can handle 640 files.
  • the Text or Binary files can be multiple data definition language or single data definition language.
  • Non-ODBC/Non-OLEDB/Non-JDBC compliant files could contain multiple Data Definition Language (Multiple DDL) 130 or single Data Definition language (single DDL) 135. Accordingly the transformation and cleansing are applied to these files 140 and stored in the target database 145.
  • Multiple DDL Multiple Data Definition Language
  • single DDL single Data Definition language
  • the present invention further classifies this text data stream into a single DDL file 135 or a multiple DDL file 130 i.e. whether the data stream has patterns of data associated with more than one definition.
  • the Text Data File 115 is one which has data separated by a delimiter. Many text files have a recognisable delimiter; if the file has a single delimiter, it is opened and the data is read line by line, as in the sketch below.
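A minimal sketch of such line-by-line reading of a delimited text file follows; the file name, delimiter and column list are hypothetical examples, not the patent's implementation:

    import csv

    # Read a single-DDL, delimiter-separated text file line by line.
    COLUMNS = ["emp_id", "name", "joined_on"]          # assumed single DDL

    def read_delimited(path, delimiter="|"):
        with open(path, newline="") as fh:
            for line_no, row in enumerate(csv.reader(fh, delimiter=delimiter), 1):
                if len(row) != len(COLUMNS):
                    print(f"line {line_no}: column count mismatch, skipped")
                    continue
                yield dict(zip(COLUMNS, row))

    for record in read_delimited("legacy_export.txt"):
        print(record)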
  • the DBMS data file 120 could further be classified as Single DDL 135 or Multiple DDL 130 depending on the interpretation of the control characters present in the stored pattern of data.
  • This data file consists of database management objects like database(s), table(s), etc.
  • the Binary File 125 can also be Multiple DDL 130 or Single DDL 135.
  • the Binary File 125 generally has contiguous blocks of data tuple(s) and may or may not contain a record delimiter. Many binary files may have an embedded DDL as part of the header definition, as in the case of dBASE (DBF) files.
  • User-defined patterns of data 127 are supported irrespective of the source of data. These user-defined data files can be classified as having a single DDL 135 or a Multiple DDL 130 as per the interpretation of control characters. The data is stored in such files as patterns of data. But data files created from proprietary sources may have a preamble header, which may not define the accompanying data. The data definition should perfectly specify the header size, the start of the data block and the pattern of data expected in a single data byte stream. Many binary data streams which support variable size data have the column data length as a prefix header for each column's data, as in the sketch below.
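The following minimal sketch shows one way such a user-defined binary pattern could be read; the header size, column count and length-prefix layout are assumed for illustration only:

    import struct

    # A user-supplied data definition states the header size and the per-record
    # column layout; each column is stored as a 2-byte little-endian length
    # prefix followed by that many bytes of data.
    HEADER_SIZE = 32
    COLUMNS_PER_RECORD = 3

    def read_length_prefixed(path):
        with open(path, "rb") as fh:
            fh.seek(HEADER_SIZE)                     # skip the preamble header
            while True:
                record = []
                for _ in range(COLUMNS_PER_RECORD):
                    prefix = fh.read(2)
                    if len(prefix) < 2:              # end of stream
                        return
                    (length,) = struct.unpack("<H", prefix)
                    record.append(fh.read(length))   # variable-size column data
                yield record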
  • the present invention allows part of the cleansing and transformation to be performed during the process of extraction itself.
  • any invalid or null entry can be replaced by a predefined default value, or any transformation of the column data content such as case changes or format representation can be performed, and these translations can be a part of the DDL definition itself, as in the sketch below.
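A minimal sketch of cleansing rules carried with the data definition follows; the column names, default values and transforms are assumed examples, not the patent's DDL syntax:

    # Per-column cleansing rules attached to the data definition.
    DDL = {
        "name":      {"default": "UNKNOWN",    "transform": str.upper},
        "city":      {"default": "N/A",        "transform": str.title},
        "joined_on": {"default": "1970-01-01", "transform": None},
    }

    def cleanse(row):
        cleaned = {}
        for column, rule in DDL.items():
            value = row.get(column)
            if value is None or value == "":          # invalid or null entry
                value = rule["default"]
            elif rule["transform"]:
                value = rule["transform"](value)      # e.g. case change
            cleaned[column] = value
        return cleaned

    print(cleanse({"name": "rao", "city": "", "joined_on": "2004-02-20"}))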
  • Other issues of constraints relating to data validity and occurrences can also be defined using indexing parameters of primary / composite key, uniqueness or duplicates allowed etc.
  • the present invention allows scripting or programming to be operated on data and user defined error logic to support intelligent decision making during the process of extraction or transformation. Also based on specific values of data encountered while extracting, intelligent decisions can be made and data routed or value based transform can be applied. Intelligent decision support systems can also be triggered when loading the finally transformed data and business intelligence can be derived. Whenever data generated has not been persisted in any of the classified data types there exists a need to convert the data stream into a corresponding data type. The conversion process may require a translation of format if the source and target data have different representation formats.
  • FIG. 2 there is shown a block diagram of the Data Migration process, consisting mainly of three phases: Pre-migration 200, Data Migration Service 205 and Post-migration 210.
  • the Data Migration Service 205 further consists mainly of four phases: Extraction 215, Transformation 220, Cleansing 225 and Loading 230 (ETCL 235), as sketched below.
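A minimal sketch of the ETCL flow as four chained steps follows; the individual functions are simple placeholders, not the patented service:

    def extract(source_rows):
        for row in source_rows:               # read from any source
            yield row

    def transform(rows):
        for row in rows:                      # convert to target representation
            yield {k: str(v) for k, v in row.items()}

    def cleanse(rows):
        for row in rows:                      # drop rows with missing values
            if all(v not in (None, "") for v in row.values()):
                yield row

    def load(rows, target):
        target.extend(rows)                   # stand-in for a database insert

    target_table = []
    load(cleanse(transform(extract([{"id": 1, "name": "a"},
                                    {"id": None, "name": ""}]))),
         target_table)
    print(target_table)                       # only the complete row survives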
  • Pre-migration 200 is the first phase in migration, which involves determining how the user will send the data that is to be migrated, gathering or importing all the data for migration, and understanding the application, the use of the data and the nature of the data to be extracted. If not available/transferable locally, the data can be sent as a compressed file through telnet or email. The cost factor comes into the picture largely when one is importing the file through Integrated Services Digital Network (ISDN) lines, thereby incurring transmission costs. Gathering/importing all the data for migration may involve decompressing the received files that are transmitted in a compressed format.
  • ISDN Integrated Services Digital Network
  • Data Extraction 215 is the process of gathering data from multiple heterogeneous data sources. It involves reading data from one or more data sources, irrespective of whether these data sources are relational or non-relational (flat file structures).
  • Transformation 220 is the process of converting data from legacy or host format to target format. In this stage, any changes to the data types of the extracted data can be effected so that data is in the required target format.
  • the data that has been extracted is scanned for any errors that may have occurred during the extraction process. These errors may be due to missing, incomplete data, noisy data containing errors or values that deviate from the expected, unwanted data that is part of the source data but not required in the destination etc. These errors can be handled automatically or as per the user specifications.
  • This Loading phase 230 of the data migration process involves sorting, summarizing and consolidating the data. During this phase the data is in finished format compatible with the destination. Constraints may be built on the data like integrity checks, views may be computed and the data can be partitioned and indexed to optimise its management. By the end of this phase the data is ready to be transferred to the destination.
  • the third phase, referred to as POST-Migration 210, involves exporting the transformed data back to the user. Depending on its size, the data is compressed and sent to the user through the desired channel. For security purposes, the target file can be encrypted before being sent to the destination. Again, this phase can be compared to the delivery of finished goods in a manufacturing scenario.
  • Fig. 3 is a block diagram depicting the functional blocks of the preferred embodiment of the present invention.
  • the Network Interface 302 helps the data migration tool of the present invention to connect to the source data by either linking or mapping files based on a raw data stream consisting of patterns of data.
  • the ODBC interface 300 is another source data port which helps the data migration tool of the present invention to connect to source data which is ODBC compliant, using an ODBC compliant driver.
  • the User Interface layer 305 is an interactive, intuitive tool which helps users define their own data scripts, constraints and translation/transformation functionality without delving into syntactical requirements.
  • non-ODBC data sources may need to be parsed (such as XML data) or a simple delimited CSV (Comma separated values) data may require primitive parsing.
  • the Parsing Engine 310 parses only script files such as an XML file. In many cases where the user defines scripts or event handlers, the input may have to be validated for syntactical errors. The Parsing Engine 310 executes these responsibilities.
  • the Data Dictionary 315 is an object definition and a constraint repository, which is created as soon as the source and target data sources are connected. Its primary function is to hold the object defining entities and the translating parameters between the source and target object which serves as a lookup table to all transforming, translating and mapping functional entities of the conversion and hence aiding the migration using the present invention.
  • the data dictionary allows an update operation to be correctly predicted, since a write into fixed-pattern legacy data has to be either at an integral multiple of the record size or exactly at an offset position within a record which is an integral addition to its column position or column offset, as in the sketch below.
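A minimal sketch of this write-position arithmetic follows; the record size and column offsets are hypothetical values assumed to come from a data dictionary:

    # Predict where a column value must be written in fixed-pattern data.
    RECORD_SIZE = 128                         # bytes per record
    COLUMN_OFFSETS = {"emp_id": 0, "name": 8, "salary": 72}

    def write_position(record_index, column):
        # start of the record plus the offset of the column within that record
        return record_index * RECORD_SIZE + COLUMN_OFFSETS[column]

    print(write_position(10, "salary"))       # 10 * 128 + 72 = 1352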
  • the Data Mapper 320 resolves the nearest data type relations. Resolution is based on SQL data types defined by ODBC and the target data type expected, without compromising the data value or relational integrity. This process essentially involves gauging the data width, scale, precision and other transformation aspects with respect to the source and target data type definitions implemented by database vendors. For example, 'Currency' and 'Money' have the same functionality but their naming differentiates these data types. Hence the Data Mapper 320 maps 'Currency' to 'Money' in the Data Dictionary 315. This mapping is done earlier in the process, when a connection is established from the source to the target.
  • the mapper uses the "best-fit" or "best-available" datatype mapping between source and target (only in case of mismatches or a missing datatype in the target database), using scale and precision mapping of the source and target datatypes so that the data width does not cause data loss or loss of quality. In case a corresponding scale and precision datatype is found, the nearest pattern of data (string to string, number to number or float or double or currency, etc.) is mapped so that any conversion process from the source column to the target column is avoided.
  • the Data Translator 325 is an extension of the data mapping functionality. It is used when there exists no corresponding equivalent implementation of the source datatype in the target database. For example, the data translator maps "text" datatype sources to a "variable" datatype in the target. The Data Translator 325 maps the most likely data type with corresponding attributes like scale, precision etc. This mapping of attributes can service the data functionality without compromising data handling, managing and manipulating capabilities. For example, a variable character ('varchar') datatype in an SQL source is 8K while in Oracle, the target varchar is 4K, as in the sketch below.
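A minimal sketch of such best-fit mapping with width clamping follows; the mapping table and the 4000-byte limit are illustrative assumptions, since actual vendor names and limits vary:

    # Map a source type to the nearest target type and clamp the width.
    TYPE_MAP = {                              # source type -> target type
        "CURRENCY": "MONEY",
        "TEXT":     "VARCHAR",
        "VARCHAR":  "VARCHAR2",
    }
    TARGET_MAX_WIDTH = {"VARCHAR2": 4000}     # assumed target limit (e.g. 4K)

    def translate(source_type, width):
        target_type = TYPE_MAP.get(source_type.upper(), source_type.upper())
        limit = TARGET_MAX_WIDTH.get(target_type)
        if limit is not None and width > limit:
            # clamp to the target limit; wider source data would have to be
            # handled separately (e.g. via a LOB type)
            width = limit
        return target_type, width

    print(translate("varchar", 8000))         # ('VARCHAR2', 4000)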
  • the Data Transformer 330 is a set of predefined logical, arithmetic and data manipulation functionalities, which help in data cleansing, formatting and loading. A typical example being the trimming of spaces before and after or padding of spaces etc. It also allows changing a datatype from ODBC compliant to non-ODBC compliant.
  • the Data Formatter 335 handles the presentation logic of the source data before it is loaded into the target system. This is generally based on the constraints dictated by the rules of the target database, for example an XSLT applied to XML data in a relational database, or any format in which a user-dictated delimiter and representation of data masks the source data output.
  • the Constraint Validator 340 is a functional translator, which works on equivalent functional features rather than data types like the data translator. It also verifies that the transformed data does not inherit any process loss, which violates the rules of target database.
  • the Script Engine 345 is a programmable language execution engine, like the one found in 'MS Word', which can execute scripts similar to Visual Basic (VB) script, JavaScript or a database procedural language (PL) like Oracle's.
  • the Script Engine 345 executes user-defined programming functionality to be operated on the source data for defining business logic. These generally may be a set of commands executed upon and driven by data values, rather than operations like triggers, which work on target databases. This is useful for defining and accumulating data-related business intelligence information and is an indispensable tool.
  • the Data Loader 350 chooses methods, depending upon initial data analysis.
  • the process of loading may require temporary storage or staged database, which is managed by the Data Loader 350.
  • the Event Handler 355 is a set of predefined call-back functions designed to work with preset patterns of data, associating function operations with each set of data while taking into consideration the various variables and runtime parameters which can operate on that pattern of data. In other words, the Event Handler 355 carries out DDL and DML operations.
  • the Scheduler 360, as the name suggests, is the primary messaging kernel, which controls and schedules the various block functionalities as per the requested operation execution. In conjunction with the Scheduler 360, the Event Handler 355 can be used for communicating between various instances of the tool running across a distributed network environment and for invoking and executing remote operations, as in the sketch below.
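A minimal sketch of an event-handler registry dispatched by a scheduler-like loop follows; the pattern names and callbacks are assumed examples:

    # Bind preset patterns of data to call-back functions, then dispatch.
    HANDLERS = {}

    def on(pattern):
        def register(func):
            HANDLERS[pattern] = func
            return func
        return register

    @on("create_table")
    def handle_ddl(payload):
        print("DDL:", payload)

    @on("insert_row")
    def handle_dml(payload):
        print("DML:", payload)

    def dispatch(events):
        for pattern, payload in events:       # scheduler routes each request
            HANDLERS[pattern](payload)

    dispatch([("create_table", "CREATE TABLE emp (...)"),
              ("insert_row", {"emp_id": 1})])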
  • FIG. 4 there is shown the process of deriving the Migration Capability Matrix, based on the adherence of ODBC drivers to standards and functionality supported by the database vendor. Hence a complete database migration may not be possible for the following reasons:
  • the drivers do not support all object migration, i.e. the catalog functions of the ODBC drivers do not reveal enough information for complete object migration.
  • the information revealed by the driver may be obfuscated.
  • Thirdly, architecture differences between the source database and target database may not support object migration. Fourthly, even when the same object types are supported across the source database and the target database, the features supported by the source vendor for an object may not be available in the target database, for example calculated columns.
  • the Capability Matrix 400 isolates the above mentioned discrepancies and finds those object types 405, which can be migrated such as Tables, Index, Views etc.
  • the objects 410 are databases, tables, views, etc and the hierarchy of an object 410 is from a server to database, from database to table space, from table space to users, from users to table(s), from tables 420 to views or indexes 425 etc.
  • These hierarchies of object types 405 dictate the sequence of object creation, i.e. a database has to exist or a new database has to be created.
  • tablespaces are created as per source properties.
  • object types 405 which the source and target database support are broadly classified into objects with data 430 or without data 435, that is, objects for which only a definition exists and which are executable (like views, procedures, functions etc.), while the rest are objects with both definition and data (like tables, indexes etc.).
  • the visual interpretation of the object matrix is as shown in the figure 4.
  • the translation metrics is the interpretation of data-type naming convention with the data functionality between the source and target database.
  • ODBC standards support a set of SQL data types, which have a predefined functional interpretation i.e. Datetime data type supports date and time as a combined entity, similarly string data type supports alphanumeric data of specified width.
  • the size of data byte depends on the Unicode standards adopted and the character set of National Language Support (NLS) used by the application to archive data. For example consider an English character set, which can be encoded using different code sets. A code set defines the bit patterns that the system uses to identify characters. A single-byte encoding method is sufficient for representing the English character set because the number of characters is not large. To support larger alphabets such as Chinese additional code sets containing multibyte encoding are necessary.
  • All the supported singlebyte and multibyte code sets must handle character encoding of one or more bytes.
  • the data byte can be classified as Single-Byte Code Sets (SBCS), Multi-Byte Code Sets (MBCS) or Double Bytes Code Sets (DBCS) based on the version of the Unicode standard and the NLS setting, as the sketch below illustrates.
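A minimal sketch showing how the byte width of the same character data depends on the code set follows; the encodings chosen are common examples rather than the NLS settings named above:

    # The same logical string costs a different number of bytes per encoding.
    text = "Rao"                               # ASCII-representable
    han = "雷"                                 # requires a multi-byte code set

    print(len(text.encode("ascii")))           # 3 bytes (single-byte code set)
    print(len(han.encode("utf-8")))            # 3 bytes (multi-byte code set)
    print(len(han.encode("utf-16-le")))        # 2 bytes (double-byte code set)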
  • SBCS Single-Byte Code Sets
  • MBCS Multi-Byte Code Sets
  • DBCS Double Bytes Code Sets
  • variable character length (varchar) datatype holds string data, but the nvarchar datatype holds Unicode string data, with the data width depending on the Unicode standard adopted.
  • Data types which exist in the source but are missing in the target need a translational map which maps one SQL data type to another without loss of data.
  • Many vendors have a different interpretation of an SQL data type with respect to data functionality, hence the present invention allows the user to deduce his own mapping in case the user wants to override the estimated mapping metrics.
  • objects or data which do not have a clear definition and path for translated migration yield errors.
  • FIG. 5 there is shown a flow chart illustrating a preferred embodiment of the present invention showing the mechanism for Migration from ODBC Compliant source format to another ODBC Compliant target format using the Data Migration Service in accordance with the present invention.
  • the first step in the migration process involves the selection of the source from which the data has to be migrated.
  • the user is expected to know the data definition and data manipulation of the source 500.
  • the user manually classifies the nature of the source 502 and target clients. The process then proceeds to ascertain whether the databases are ODBC compliant 504 or not.
  • If the file is zipped, it is unzipped and the output of the preprocess command is given to the user 506 to manually classify the source. Based on this classification the entire process flow can be explained in four combinations.
  • Firstly, relational database to relational database migration. Secondly, relational database to non-relational database migration. Thirdly, non-relational database to relational database migration. Fourthly, non-relational database to non-relational database migration.
  • If the data source is Non-ODBC compliant it could be binary, text or a preprocess command 508. If the data source is ODBC compliant then the user selects the source (to copy data from) ODBC driver, for which the Data Source Name (DSN) consisting of appropriate parameters like server name, database user, etc. is checked for existence 510. If the DSN does not exist then all required DSN details are obtained 512 and a new DSN connection to the source is created 514. The DSN is the access door to the database. In the event the DSN is created successfully, the connection to the source database is established or checked 516, as in the sketch below. If there is an error connecting to the source an appropriate error message is reported 518. In the event the connection to the source exists, the username and password are checked for authenticity 520.
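A minimal sketch of this DSN check and connection follows, assuming the pyodbc package; the DSN name and credentials are placeholders:

    import pyodbc

    def connect_source(dsn, uid, pwd):
        # Does the DSN exist among the sources registered with the driver manager?
        if dsn not in pyodbc.dataSources():
            raise RuntimeError(f"DSN '{dsn}' not found; create it first")
        try:
            return pyodbc.connect(f"DSN={dsn};UID={uid};PWD={pwd}")
        except pyodbc.Error as exc:            # bad credentials or server down
            raise RuntimeError(f"cannot connect to source: {exc}")

    conn = connect_source("LEGACY_SRC", "scott", "tiger")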
  • DSN Data Source Name
  • the translation metrics is then derived 542 based on the source database's data type availabilities and whether every available source data type supported in the source database has an equivalent data type and data supporting capabilities.
  • the translation option is then verified with respect to target database capabilities 544.
  • On checking the version compliance 538, if the versions are compatible, an object-to-object verification of what can possibly be migrated is carried out, and a source object count that can be migrated is derived from this 546.
  • an object collection is created 548 from the selection specified in the migration option for which there stands a fair chance of successful object migration.
  • An object count of collection is then derived 550.
  • Also, the existence of another object is checked for 552, after which SQL queries (DDL) are prepared 554 as per ODBC adherence and the capability matrix and translation metrics supported by the target vendor for object creation.
  • DDL SQL queries
  • One or more tables can be selected from this list, and the destination table name can also be specified as an object to be dropped as per the option specified 556.
  • an execute object creation script is directly triggered 558. If the user has selected the option to drop the object before firing the creation, the target objects are dropped 560. Such cases may arise, firstly, if the target database already has legacy data from prior applications. Secondly, the existing objects may be remnants of previous aborted or unsuccessful migrations.
  • the next step involved is the process of checking if the object creation is successful 560.
  • an error is reported and the translation script is resumed 562.
  • the object creation checking for the SQL log option specified is carried out 564.
  • an SQL log option is specified, it is appended to log or an XML file 566.
  • an estimate of source object data count is derived 568.
  • the DDL statements are prepared for target database object with respect to translation metrics 570.
  • the checking is carried out for specified SQL log options if any 572.
  • the DML statements for target database object are prepared with respect to transform metrics 578.
  • the DML statements are then checked to ascertain if the SQL log options are specified 580. In case an SQL log option is specified, it is appended to log or an XML file 582. In the event of no SQL log option specified, the DML statement on the target database object are executed 584. Similarly, after appending to log/XML file 582, the DML statement on the target database object is executed 584.
  • the DML statement is then checked to ascertain if the execution is successful 586. Upon execution of the DML statement on the target database, any error is reported and logging as per the error conditions is carried out 588. In the event of an unsuccessful DML statement execution, an intelligent script-based error handler is also supported which allows verifying steps to be executed in case of errors 590. The preferred embodiment of the present invention then checks for any valid steps to handle the generated error 592. In the event of any valid existing steps to handle the generated error, the user-defined steps as per the error conditions are executed 593. The next step involves the flagging/reporting and logging as per error conditions 588. Similarly, in case of no valid steps to handle generated errors, the errors are flagged/reported and logged as per error conditions 588.
  • the Datacount is then checked if it is equal to zero 594. In the event it is not equal to zero, the data count is decreased by one till it is zero 595. After this step, the process of preparing the DML statements for the target database object with respect to transform metrics (578 onwards) is carried out till the time the data count becomes zero. On checking if the Datacount is equal to zero 594, in the event that the Datacount is equal to zero, the object count is decreased until it becomes zero 596. If any more objects are remaining, this is checked till the time the object count becomes zero 597. In the event that the object count is not equal to zero, the existence of another object is checked for 552.
  • FIG. 6 is a flow diagram illustrating an alternative embodiment depicting the mechanism for Migration from a Binary File to an ODBC compliant format using the present invention.
  • the data definition has to be clearly known by the user.
  • the source data is non-ODBC compliant 600, so it is checked to ascertain if it is a Binary File 602. In the event the source file is non-Binary, further processing is done as a text file or preprocess command 604. Further, in the event it is a Binary file, it is checked for a single DDL 606. In the event that it is a single DDL, the DDL information is obtained 608. In the event more than one DDL is present, the DDL count is obtained 610.
  • the DDL information is sought 612 and saved in DDL collection 614.
  • the DDL count is then decremented 616 and checked to see if the DDL count becomes equal to zero 618. In the event that the DDL count is not equal to zero, the DDL information is sought 612 again.
  • the binary file is opened 620.
  • the file is then checked for headers 622.
  • the header is skipped 624.
  • Many sources of data such as .DBF from FoxPro / clipper have data definitions as part of data file header.
  • the data is read as per the source DML 626.
  • the data is then checked to ascertain if it has a single DML 628.
  • the mapping of columns as per source DML is carried out 630.
  • the translation of the DML is carried out 632, after which the process of mapping of columns as per source DML is carried out 630.
  • the column count is obtained 634 and the data content is verified as per the DML 636.
  • the data is then checked for correctness 638 and, in case incorrect data is encountered, an error is reported and a log entry is made 640.
  • the constraints are verified 642.
  • the constraints violation is checked for 644. In the event of no constraints violation, then optional values are used in the column 646. In the event if there is constraint violation then a check is carried out for default values of data if one exists 648.
  • the process of updating to default value in column is carried out 650.
  • the updation to null data is carried out 652.
  • the next process involves checking for any further columns 654. If there exists any more columns, it proceeds to translate to the next column data 656. After this step it proceeds to check the data for correctness 638.
  • the End of file (EOF) is checked for 658. If the EOF is not reached then the row values are inserted 660. Next, the execution of user defined business logic script is carried out 662. It is then checked to ascertain if it is successfully executed 664.
  • FIG. 7 is a flow diagram illustrating an alternate embodiment of the present invention, depicting the mechanism for Migration from Text file to ODBC compliant format using the Data Migration Tool of the present invention.
  • If the data source is non-ODBC and non-binary, it could possibly be a text file or it may require a preprocess command.
  • the source data is non-ODBC compliant and non-binary 700, so it is checked to ascertain if it is a Text File 702. In the event that it is not a Text File, it is classified as a pre-process command 704. In an alternate embodiment of the present invention, if the source data is not a Text File, it is checked for further kinds of commands. On classifying the source data as a pre-process command 704, the source file is executed from the pre-process 706 and the source file is derived from the pre-process 708. Now, using these source data files, the user has to classify the source data, and further operations are carried out as per the nature of the file 710. For example, if the source file is in Zip format then the preprocess command is performed on it, that is, it is unzipped.
  • the source file so derived is classified by the user as per the nature of the source file and the further operations are carried out as per the file types.
  • the data is then checked for a single DDL 730. If a single DDL is found, the columns are mapped as per the source DDL 732. In the event that a single DDL is not found, the DDL is translated 734, after which the columns are mapped as per the source DDL 732.
  • the column count is obtained 736 and the verification of data content as per DDL is carried out 738.
  • the data is checked for correctness 740. If the data is incorrect, an error is flagged 742. In the event of the data being correct, the constraints verification is carried out 744. After this, the data is checked for constraint violations 746. In the event of no constraint violation, the optional values in the column are used 748. In the event of a constraint violation, the default value in the column is checked 750. In the event of the default value existing, the column is updated to the default data value 752. In the event that the default value of data is nonexistent, the null data is updated 754.
  • LOB Large Objects
  • Fig. 8 illustrates a first step of data transfer between Heterogeneous Databases using the 'Data Migration Wizard'.
  • the Data Migration Wizard guides the user through a series of simple steps to transfer data between different databases. As shown, this is step 1 of the 7-step procedure. It is carried out when the user clicks on the 'ODBC' button. This brings up a form titled 'Data Migration Wizard' 800. The 'Next' 805 icon has to be clicked to continue. This brings up the Choose A Data Source(s) page.
  • Fig. 9 illustrates a second step of transfer of data between Heterogeneous Databases using the 'Data Migration Wizard'. As shown in the diagram this is step 2 of the 7-step procedure.
  • Source ODBC driver 900 has to be selected to copy data from the source.
  • DSN 905 that is created using that driver is selected.
  • the 'New DSN' 910 button allows the user to create a new data source.
  • the User Identification (UID) 915 and Password (PWD) 920 is entered for that database and the 'Connect' button 925 is clicked on to connect to the database.
  • 'Next' 930 has to be clicked to proceed and it takes you to the Select Source tables page.
  • Fig. 10 illustrates a third step of transfer of data between Heterogeneous Databases using Data Migration Wizard. As shown this is step 3 of the 7-step procedure.
  • the user clicks 'Next' on the Choose A Data Source (s) page of the Data Migration Wizard it brings up this user interface for Select Source Table for Migration.
  • a list of all the tables in the source database will now be available along with count of records in each table.
  • One or more tables can be selected from this list and also destination table name can be specified.
  • the 'Select All' 1000 option allows the selection of all the tables available in the selected schema.
  • the 'Next' icon 1005 leads the user to the Choose A Destination page.
  • Fig. 11 illustrates a fourth step of transfer of data between Heterogeneous Databases using Data Migration Wizard. As shown this is step 4 of the 7-step process.
  • the ODBC driver 1100 is selected, to copy data to (destination).
  • the 'New DSN' 1110 icon allows you to create new data source.
  • the UID 1115 and PWD 1120 for that database are entered and the 'Connect' icon 1125 is clicked to connect to the database. Clicking the 'Next' icon 1130, the user is led to the 'Select the Options for Migration' page.
  • Fig. 12 illustrates a fifth step of transfer of data between Heterogeneous Databases using Data Migration Wizard. As shown this is step 5 of the 7-step process. After the user clicks 'Next' on the 'Choose a Destination of the Data Migration Wizard', it brings up this user interface for Selecting the Options for Migration.
  • This page allows the user to select the manner in which data transfer should take place.
  • the user can opt between the following: Transfer Table (s) With Data 1200 where entire table along with the records are transferred; Transfer Table structure only where only table definition is transferred; In case of existing tables, the user can opt for one of the following options.
  • Fig. 13 illustrates a sixth step of data transfer between Heterogeneous Databases using the Data Migration Wizard. As shown, this is step 6 of the 7-step process. After the user clicks 'Next' on the Selecting the Options for Migration page of the Data Migration Wizard, it brings up this user interface for 'Transform The Column Information'. 'Destination Table' 1300, indicating a list of all the selected tables, is available to the user for further transformations, and the 'Transform' icon 1305 leads the user to the next page.
  • Fig. 14 illustrates the seventh and final step of data transfer between Heterogeneous Databases using the Data Migration Wizard.
  • As shown in the diagram, after the user clicks 'Transform' on the Transform The Column Information page of the Data Migration Wizard, it brings up this user interface for further transforming the column information.
  • the Column Mapping feature opens where the user can change the data type, size and precision attribute of the columns of the table one at a time.
  • Selected Columns 1400 includes a list of columns, which can be fetched for the Source table.
  • the data type for each Destination column from the drop down list can be entered in the Data Type field 1405.
  • the length in units corresponding to data type of the destination column can be entered in the Length field 1410.
  • This Length field is only applicable for the char, varchar, nchar, nvarchar, binary and varbinary data types. A size smaller than the length of source table can result in data truncation.
  • Decimal 1415 applies to decimal and numeric data types only where the maximum number of decimal digits that can be stored, to the left and to the right of the decimal point can be entered.
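A minimal sketch of the truncation risk mentioned above follows; the values, length and scale are illustrative only:

    from decimal import Decimal, ROUND_HALF_UP

    def fit(value, length):
        return value[:length]                   # char/varchar truncation

    def rescale(value, scale):
        # reduce the number of digits to the right of the decimal point
        return Decimal(value).quantize(Decimal(1).scaleb(-scale),
                                       rounding=ROUND_HALF_UP)

    print(fit("Vaman Technologies", 10))        # 'Vaman Tech'
    print(rescale("123.456", 2))                # 123.46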
  • Using the up-arrow button 1420, the columns required for data transfer can be placed in the Available Columns list 1425.
  • the down-arrow button 1430 is used to remove the column from the list. Clicking on the "Apply" icon 1435 stores the transformed values while the 'Cancel' icon 1440 reverts all the changes made by the user. Clicking on the 'Finish' icon 1445 begins the process of transferring data.
  • Fig. 15 illustrates the final report of the transfer of data between Heterogeneous Databases using the Data Migration Wizard. After the user clicks 'Next' on the 'Transform The Column Information' page of the Data Migration Wizard, it brings up this user interface for reporting the Data Migration Status. After migration is complete, the transfer of data along with the time taken for the transfer can be viewed here.
  • clicking on the 'Connect' 1600 icon leads the user to the page described in Fig 17.
  • Fig 17 depicts the page in which the Connection Parameters are to be inserted.
  • the user is required to select several parameters such as the database 1700 into which the data has to be transferred, enter the UID 1705 and Password 1710, and connect to that database by clicking the 'OK' icon 1715.
  • Fig 18 illustrates a page with an open dialog box from which the file that is to be transferred is selected.
  • Fig 19 illustrates the page that opens on selecting the file that is to be transferred and it displays the sample data from the file.
  • After viewing the file, the user is required to enter the file delimiter (if present) in the 'Enter Delimiter' field 1900, as well as the column number where the table name is specified in the 'Enter Column No' field 1905. Clicking on the 'Get Tables' icon 1910 displays a list of tables present in the file.
  • Fig 20 illustrates the File Loader form, wherein by clicking the 'Hex /Binary View (Bytes)' icon 2000, the user can view contents of the flat file in hexadecimal and ASCII formats.
  • the structure of the Byte Arrangement tab is as follows:
  • the leftmost section lists the file position number after every 16 bytes and also shows the file size in bytes shown at the top of the section.
  • the middle section provides a row of 16-byte block information in hexadecimal format.
  • the right section provides the same information in ASCII characters. Characters of ASCII value less than 32 are not visible.
  • Fig 21 illustrates the page that loads on clicking the 'General Parameters' icon 2100, where the contents of the file are displayed. The user is required to specify whether the file is a text file or binary file 2105.
  • the user is required to specify the delimiter (if any) that separates the data in the file.
  • the user is required to classify the records 2110 present in the table.
  • If the data in the file is not separated by a delimiter 2115, then the number of bytes per record is specified in the space provided.
  • Clicking the 'Generate Fields' button 2120 generates the number of columns in the table.
  • the table structure will be displayed.
  • the users specify the number of bytes per record 2125 and choose the data type for each column from a drop-down box as illustrated in Fig. 21.
  • As depicted in Fig. 22, the user clicks on the 'Table Definition' tab 2200.
  • the structure of the table can be redefined.
  • the user can modify the column name, data type, field size or precision/scale as per the requirements.
  • the user can specify the column level 2205 and table level parameters 2210 in the grid provided as illustrated in Fig. 22.
  • the user selects the 'Transformed View' option 2505. This view helps to compare the source data and the target data. In order to view only the transformed data, the user clicks the 'Hide Source Column' option 2510 as illustrated in Fig. 25.
  • FIG. 26 shows importing Data from Flat Files using File Loader.
  • a flat file is any text or binary file containing raw data possibly belonging to a legacy database.
  • the File loader helps the user to import data from flat files and display it in a spreadsheet. The steps to be followed to perform this operation are as follows:
  • the first tab is the Bytes Information tab 2600.
  • the user can specify the structure of the flat file. This information is used to correctly load the data contained in the flat file.
  • the user is required to enter the values in the space provided such as total Header Bytes 2605, total footer bytes 2610, total bytes in record 2615 and total columns in record 2620.
  • When the user clicks on the 'OK' button 2625, a number of blank rows corresponding to the value specified by the user for the total number of columns in Step 1 will be loaded in a spreadsheet, as illustrated in the next diagram.
  • the user can proceed to click on the 'Process' button 2800. Clicking the 'OK' button opens a file chooser 2805 and the user is prompted to select the file.
  • As shown in Fig. 29, the user can click on the 'Bytes Arrangement In the File' tab 2900. After this, the user can view the contents of the flat file in hexadecimal and ASCII formats.
  • the structure of the Byte Arrangement tab is as shown in the figure the leftmost section lists the file position number 2905 after every 16 bytes and also shows the file size in bytes 2910 shown at the top of the section.
  • the middle section provides a row of 16-byte block information in hexadecimal format 2910.
  • the right section provides the same information in ASCII characters. Characters of ASCII value less than 32 are not visible. If any hex/character value is selected, the corresponding character/hex value is highlighted. Its ASCII and binary values are also displayed at the upper right corner.
  • the user can scroll down the list a row at a time using the arrow keys or a page at a time using the page-up and page-down keys. Additionally the navigational buttons are provided at the upper right corner of the tool window including the Go To button.
  • the user can also search through the list using hex search or (case sensitive) character search using the search fields provided in the data migration tool window. The search fields will be highlighted in the appropriate windows if successful. Printing facility is also available.
  • the data displayed in the image above can be written into an SQL Script 3005 and can also be inserted into the database 3010, and the appropriate options can be selected.
  • the user can click on the 'Finish' button 3015. If the 'Insert Into Database' 3010 is selected, a 'Logon' form will open.
  • a 'Logon' screen will open as depicted in Fig. 31, and the data has to be transferred.
  • the username (UID) 3110 and password 3115 are entered along with other parameters such as the DSN 3105, Driver 3125 and server 3130, and the connection to that database is established 3120 by clicking the 'OK' button 3135.
  • the data from the file will be inserted in to the corresponding database.

Abstract

The present invention relates generally to the field of data migration and conversion from any source of data. More particularly, the present invention relates to the field of converting heterogeneous data to a user-defined format for the purpose of sharing data, analyzing and deriving intelligent information from the converted data. The present invention enables the features of relational database technologies to be used on non-relational database generated data without programming effort.

Description

TITLE OF INVENTION
System and Method for Data Migration and Conversion
BACKGROUND OF THE INVENTION The evolution and rapid advancements in computer technology have resulted in the creation of numerous software applications that have automated various business processes, thereby benefiting organizations by significantly increasing their efficiency. However, all business processes rely on persistent forms of data, which refers to data saved from memory on to the disk. These persistent forms of data are used as the knowledge base for applications that need such data to impart intelligence to business applications. Persistent data was initially saved in application-specific formats such as ".doc", ".xls" and ".ppt" till the concept of databases evolved. Databases stored information only after the required and sufficient data constraints were met. Further, using databases made searching and sorting operations easier.
However, the applications never demanded that the persistent data format be saved in a database, as the requirement never arose and the volumes of data were not sufficiently large. Therefore such data was classified in standard data types like "Integer", "Number", "Currency", "Date", "Time", "String" etc. These data types became widely acceptable to vendors and business application developers, who started adhering to these formats of data representation. However, with the increase in the volume of data, these proprietary file formats became unmanageable and searching for information across various files and formats became cumbersome.
As a result, search algorithms and techniques were derived which could help search and sort the various classified data types. These ever-improving algorithms helped to reduce the time spent searching and managing persistent data irrespective of the volume. This vendor-specific data type classification led to a standard in which data could be classified, across various sources of data generation, into formats or data types. Initially only a few data types were classified, but as business demands increased, newer patterns of data evolved, which resulted in the need to classify such new patterns of data.
Initially there existed limited data types, which had a limited or fixed byte space requirement. In other words, they had a fixed-width data type across records, a record being essentially a grouping of collective information. Gradually these limitations of fixed size disappeared and variable data types evolved which could save data exactly as per the size of the data and not as per the size specified in the data definition. This development resulted in the conservation of disk space, but increased the complexity involved in searching and sorting data. For example, consider an earlier data type with a size of 200 bytes, using variable characters (varchar). One may only provide data which is 110 bytes, and the remaining space is returned to the system for the next data. Now, since each data item is not equally spaced, a search operation on this data is complex, as an integral record-size jump is not possible.
Newer attributes were added to define these data types and associate their persistent form requirements such as scale, precision, field width, formats for representation etc. Scale and precision specified the bytes consumed before and after the decimal place. Formats of representation of data also got standardized, as the same data, when persisted in various formats, could lead to data misinterpretation. For example, in date/month/year (dd/mm/yyyy) format, date information with day values less than 13 could be misinterpreted if the format were changed to month/date/year (mm/dd/yyyy), as the sketch below illustrates.
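A minimal sketch of this ambiguity follows; the date value is an illustrative example:

    from datetime import datetime

    # The same persisted string yields two different dates depending on
    # which representation format the reader assumes.
    raw = "04/02/2004"

    as_ddmmyyyy = datetime.strptime(raw, "%d/%m/%Y")   # 4 February 2004
    as_mmddyyyy = datetime.strptime(raw, "%m/%d/%Y")   # 2 April 2004
    print(as_ddmmyyyy.date(), as_mmddyyyy.date())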
During this standardization process the development of applications proceeded, and many skipped the cycle of acceptability for various reasons. Firstly, the switchover from a proprietary format to a standard format required a lot of redesign and rewriting of applications, which could not be implemented on an existing live system. For example, migration of data from (a vendor with) an earlier proprietary format to a standard format was a problem, since it required a considerable amount of redesigning and recoding. Further, people took time to adapt to newer systems and this proved to be a significant drawback. In addition, many mission-critical applications, which demanded speed, could not accept layers of processes since this added overheads for standardization. Many real-time applications or hardware devices were incapable of supporting the requirements necessary for standardization due to their limitations of speed or other hardware parameters, or due to stocks in the case of older inventories.
In addition, many a time, users of proprietary applications who were dependent on solution vendors for standardization could not switch over, as many vendors perished in the technology evolution process and users never had the information or code of the system to adapt to the change. Also, the lack of competency at the user level to change proprietary systems led to the accumulation of large volumes of data which later needed to be shared and converted.
The increase in the volume of data generated per day by business activities led to the rapid development of various players providing improving technology for their own proprietary structures and formats of data storage. Further improvements in system hardware component technology, including processor, memory, storage technology, network components etc., urged the need for newer, faster and more efficient technology for data access. In order to protect an organization from catastrophic loss, data is required to be replicated at two geographically separate locations to provide disaster tolerance. At times older databases suffer lower utilization, which could be due to hardware and/or software limitations. To improve system performance in terms of more efficient storage and retrieval, data needs to be migrated to newer, better and improved systems with advanced software and hardware configurations.
There exist several software tools currently, such as middleware, which allows data migration and conversion as well as mapping and exchange of legacy data without disturbing the legacy application. However the existing middleware is very expensive and requires tremendous reprogramming to achieve the desired results. Also, the user or person carrying out the migration needs to have a detailed understanding of both the source as well as the target system for doing valid migration and mapping. There exists no known system or method, which can enable data migration in a cost effective manner, without having an in-depth knowledge of the source and target systems.
Accordingly, a need exists for a system by which huge volumes of data can be handled efficiently, including data in application-specific formats, which could be ODBC-compliant or non-ODBC-compliant data, and by which data can be retrieved into a common format independent of the source of the data and without programming effort. There also exists a need for a data migration tool that allows integration of various heterogeneous data across various Operating Systems to merge with a central standard target database. There exists a further need for features of Data Migration technologies to be implemented and used for non-relational-database data without reprogramming.
SUMMARY OF THE INVENTION
To meet the foregoing needs, the present invention provides a software-implemented process, system, and method for use in a computing environment. The preferred embodiment of the present invention provides a powerful mechanism to transfer data (i.e. Tables) between heterogeneous databases which could either be ODBC / OLEDB / JDBC compliant or non-compliant. It makes it easy to import, export and transform data between any popular data formats including binary, text, databases etc. The present invention provides a data migration tool used to transfer data between heterogeneous databases, and also a file loader used to import data from flat files.
In an alternative embodiment the present invention is a data migration tool which allows mapping and extracting of any pattern of data, irrespective of the source of generation of the data pattern, to a target ODBC / OLEDB / JDBC compliant or any user-defined pattern. This entire process may be a one-time or a periodic effort, which can be configured as a template and scheduled as a job.
The data migration process consists mainly of three phases: Pre-migration, Data Migration Service and Post-migration. The Data Migration of the present invention resolves the nearest data type relations. Resolution is based on SQL data types defined by ODBC / OLEDB / JDBC and the target data type expected, without compromising the data value or relational integrity.
In the preferred embodiment of the present invention, the Non-ODBC/Non-OLEDB/Non-JDBC compliant data (flat files) could be a Text file, a DBMS file, a Binary file or a User Defined Pattern of data, based on the interpretation of control-character data, such as Ctrl or carriage-return characters, in the file.
If the data source is Non-ODBC compliant it could be binary, text or require a preprocess command. If the data source is ODBC compliant then the user selects the source (to copy data from) ODBC driver, for which the Data Source Name (DSN), consisting of appropriate parameters like server name, database user, etc., is checked for existence.
On successful execution the data migration tool translates the next data record and then reads the data as per the DDL.
BRIEF DESCRIPTION OF THE DRAWINGS
The various objects and advantages of the present invention will become apparent to those of ordinary skill in the relevant art after reviewing the following detailed description and accompanying drawings, wherein
FIG 1 is a block diagram depicting the classification of the source of persistent data as Relational or Non-relational. Further the Non-relational could be Binary, Text, DBMS file with single or multiple DDL. The data after transformation and cleaning is in the target Database.
FIG. 2 is a block diagram depicting the Migration process consisting mainly of three phases: the Pre-Migration (PRE), actual Migration (DMS) and Post Migration (POST).
FIG. 3 is a block diagram depicting the functional blocks of the preferred embodiments of the present invention.
FIG. 4 is an Object matrix
FIG. 5 (pages a and b) is a flow chart illustrating a preferred embodiment showing the mechanism for Migration from one ODBC compliant format to another ODBC compliant format using the Data Migration Tool in accordance with the present invention.
FIG. 6 is a flow chart illustrating an alternative embodiment of the present invention, depicting the mechanism for Migration from text file to ODBC compliant format using the Data Migration Tool in accordance with the present invention.
FIG. 7 is a flow chart illustrating an alternative embodiment depicting the mechanism for Migration from binary file to ODBC compliant format using the Data Migration Tool in accordance with the present invention.
Fig. 8 illustrates the first step of transfer of data between Heterogeneous Databases using the Data Migration Wizard.
Fig. 9 illustrates the second step of transfer of data between Heterogeneous Databases using the Data Migration Wizard, showing the selection of the ODBC driver, DSN, user Id and Password.
Fig. 10 illustrates the third step of transfer of data between Heterogeneous Databases using Data Migration Wizard depicting the tables to choose for migration.
Fig. 11 illustrates the fourth step of transfer of data between Heterogeneous Databases using the Data Migration Wizard, depicting the destination ODBC driver, DSN, userid and password.
Fig. 12 illustrates the fifth step of transfer of data between Heterogeneous Databases using the Data Migration Wizard, depicting the selection of options for migration.
Fig. 13 illustrates the sixth step of transfer of data between Heterogeneous Databases using Data Migration Wizard showing the option to transform the column information.
Fig. 14 illustrates the seventh step of transfer of data between Heterogeneous Databases using the Data Migration Wizard, depicting the selected columns and available columns.
Fig. 15 illustrates the Data Migration Wizard reporting the data migration progress after the transfer of data between Heterogeneous Databases.
Fig. 16 illustrates the present invention for importing data from Flat Files using a File Loader
Fig. 17 illustrates the present invention for importing data from Flat Files using a File Loader
Fig. 18 illustrates the present invention for importing data from Flat Files using a File Loader
Fig. 19 illustrates the present invention for importing data from Flat Files using a File Loader
Fig. 20 illustrates the present invention for importing data from Flat Files using a File Loader
Fig. 21 illustrates the present invention for importing data from Flat Files using a File Loader
Fig. 22 illustrates the present invention for importing data from Flat Files using a File Loader
Fig. 23 illustrates the present invention for importing data from Flat Files using a File Loader
Fig. 24 illustrates the present invention for importing data from Flat Files using a File Loader
Fig. 25 illustrates the present invention for importing data from Flat Files using a File Loader
Fig. 26 illustrates the present invention for importing data from Flat Binary Files using a File Loader
Fig. 27 illustrates the present invention for importing data from Flat Binary Files using a File Loader
Fig. 28 illustrates the present invention for importing data from Flat Binary Files using a File Loader
Fig. 29 illustrates the present invention for importing data from Flat Binary Files using a File Loader
Fig. 30 illustrates the present invention for importing data from Flat Binary Files using a File Loader
Fig. 31 illustrates the present invention for importing data from Flat Binary Files using a File Loader
DETAILED DESCRIPTION OF THE INVENTION
While the present invention is susceptible to embodiment in various forms, there is shown in the drawings, and will hereinafter be described, a presently preferred embodiment, with the understanding that the present disclosure is to be considered an exemplification of the invention and is not intended to limit the invention to the specific embodiment illustrated. In the present disclosure, the words "a" or "an" are to be taken to include both the singular and the plural. Conversely, any reference to plural items shall, where appropriate, include the singular.
In FIG. 1 there is shown a block diagram depicting the classification of the persistent data 100, irrespective of the source of generation, as ODBC/OLEDB/JDBC compliant 105 and Non-ODBC/Non-OLEDB/Non-JDBC compliant data files 110. These data sources are endian neutral, and also independent of operating systems such as Windows, Linux, MacOS, Unix etc. and of language. In the preferred embodiment of the present invention, the Non-ODBC/Non-OLEDB/Non-JDBC compliant (flat) files 110 could be a Text file 115, a DBMS file 120, a Binary file 125 or a User Defined Pattern of data 127, based on the interpretation of control-character data, such as Ctrl or carriage-return characters, in the file. Also, many proprietary systems saved multiple table data in a single data file, for example Access, to overcome legacy Operating System limitations on the number of open file handles; for example, the Disk Operating System can handle 640 files. The Text or Binary files can contain multiple or single data definition language definitions. Further, these types of Non-ODBC/Non-OLEDB/Non-JDBC compliant files could contain multiple Data Definition Language (Multiple DDL) 130 or single Data Definition Language (single DDL) 135. Accordingly the transformation and cleansing are applied to these files 140 and the result is stored in the target database 145.
Hence the present invention further classifies this text data stream into a single DDL file 135 or a multiple DDL file 130, i.e. whether the data stream has patterns of data associated with more than one definition. The Text Data File 115 is one which has data separated by a delimiter. Many text files have a classifiable delimiter; if a file has one delimiter, then the file is opened and data is read line by line.
The DBMS data file 120 could further be classified as Single DDL 135 or Multiple DDL 130 depending on the interpretation of the control characters present in the stored pattern of data. This data file consists of database management objects like database(s), table(s), etc. The Binary File 125 can also be Multiple DDL 130 or Single DDL 135. The Binary File 125 generally has contiguous blocks of data tuple(s) and may or may not contain a record delimiter. Many binary files may have an embedded DDL as part of the header definition, as in the case of dBase (DBF) files.
Also, User-defined patterns of data 127 are supported irrespective of the source of data. These user-defined data files can be classified as having a single DDL 135 or a Multiple DDL 130 as per the interpretation of control characters. The data is stored in such files as patterns of data. But data files created from proprietary sources may have a preamble header, which may not define the accompanying data. The data definition should perfectly specify the header size, the start of the data block and the pattern of data expected in a single data byte stream. Many binary data streams which support variable-size data have the column data length as a prefix header for each column's data.
Taking care of these DDL constraints, the present invention allows part of the process of cleansing and transformation to take place during the process of extraction itself. Hence any invalid or null entry can be replaced by a predefined default value, or any transformation of the column data content, such as case changes or format representation, can be performed, and these translations can be a part of the DDL definition itself. Other constraints relating to data validity and occurrences can also be defined using indexing parameters such as primary / composite key, uniqueness or duplicates allowed etc. After all transformation and cleansing the data is finally stored in the Target Database.
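By way of illustration only, the following minimal Python sketch shows how cleansing and transformation rules of the kind described above could be declared alongside a hypothetical per-column DDL definition and applied while each record is extracted; the column names, defaults and transforms are assumptions, not part of the disclosed implementation.

```python
# Minimal sketch (not the patented implementation): applying cleansing and
# transformation rules declared as part of a hypothetical DDL definition
# while each record is being extracted.

# Hypothetical per-column DDL entries: name, default used for null/invalid
# values, and an optional transform such as a case change.
DDL = [
    {"name": "CUST_NAME", "default": "UNKNOWN", "transform": str.upper},
    {"name": "CITY",      "default": "N/A",     "transform": str.title},
    {"name": "BALANCE",   "default": "0.00",    "transform": None},
]

def cleanse_record(raw_values):
    """Replace null/empty values with declared defaults and apply
    column-level transforms during extraction."""
    cleansed = []
    for column, value in zip(DDL, raw_values):
        if value is None or str(value).strip() == "":
            value = column["default"]                 # default substitution
        elif column["transform"] is not None:
            value = column["transform"](str(value))   # e.g. case change
        cleansed.append(value)
    return cleansed

print(cleanse_record(["john doe", None, "125.40"]))
# ['JOHN DOE', 'N/A', '125.40']
```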
The present invention allows scripting or programming to be operated on data and user defined error logic to support intelligent decision making during the process of extraction or transformation. Also based on specific values of data encountered while extracting, intelligent decisions can be made and data routed or value based transform can be applied. Intelligent decision support systems can also be triggered when loading the finally transformed data and business intelligence can be derived. Whenever data generated has not been persisted in any of the classified data types there exists a need to convert the data stream into a corresponding data type. The conversion process may require a translation of format if the source and target data have different representation formats.
Hence the popular data migration services and conversion methodology classify this task into four processes: Extraction, Transformation, Cleansing and Loading. Each of these processes may have some housekeeping tasks classified as pre-extraction, pre-transformation, pre-cleansing and pre-loading, as per the figure.
Referring now to the drawings, particularly FIG. 2, there is shown a block diagram of the Data Migration process consisting mainly of three phases: Pre-migration 200, Data Migration Service 205 and Post-migration 210. The Data Migration Service 205 further consists mainly of four phases, which include Extraction 215, Transformation 220, Cleansing 225 and Loading 230 (ETCL 235).
Pre-migration 200 is the first phase in migration, which indicates how the user will send the data that is to be migrated, gathering or importing all the data for migration, understanding the Application, use of the data and nature of data to be extracted. If not available/transferable locally, the data can be sent as a compressed file through telnet or email. The cost factor comes into picture largely when one is importing the file through Integrated Services Digital Network (ISDN) lines, thereby incurring transmission costs. Gathering/importing all the data for migration may involve decompressing the received files that are transmitted in a compressed format.
After gathering the data, the next phase is Extraction 215, Transformation 220, Cleansing 225 and Loading 230 (ETCL 235) of data. In the second phase of migration, the actual ETCL 235 process is undertaken to migrate the data to the desired destination. Data Extraction 215 is the process of gathering data from multiple heterogeneous data sources. It involves reading data from one or more data sources, irrespective of whether these data sources are relational or non-relational (flat file structures).
Transformation 220 is the process of converting data from legacy or host format to target format. In this stage, any changes to the data types of the extracted data can be effected so that data is in the required target format.
In this Cleansing phase 225, the data that has been extracted is scanned for any errors that may have occurred during the extraction process. These errors may be due to missing, incomplete data, noisy data containing errors or values that deviate from the expected, unwanted data that is part of the source data but not required in the destination etc. These errors can be handled automatically or as per the user specifications.
This Loading phase 230 of the data migration process involves sorting, summarizing and consolidating the data. During this phase the data is in finished format compatible with the destination. Constraints may be built on the data like integrity checks, views may be computed and the data can be partitioned and indexed to optimise its management. By the end of this phase the data is ready to be transferred to the destination.
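The four ETCL phases can be pictured, purely as a hedged sketch, as four functions chained over an in-memory row set; the example rules below (type coercion, dropping empty names, sorting before load) are illustrative assumptions rather than the behaviour of the disclosed tool.

```python
# Illustrative-only sketch of the four ETCL stages chained over an in-memory
# row set; names and rules here are assumptions, not the patented tool.

def extract(rows):                       # Extraction: gather source tuples
    return [r for r in rows]

def transform(rows):                     # Transformation: host -> target types
    return [(int(i), name.strip()) for i, name in rows]

def cleanse(rows):                       # Cleansing: drop rows violating rules
    return [r for r in rows if r[1]]

def load(rows):                          # Loading: sort/consolidate for target
    return sorted(rows, key=lambda r: r[0])

source = [("2", " beta "), ("1", "alpha"), ("3", "")]
print(load(cleanse(transform(extract(source)))))
# [(1, 'alpha'), (2, 'beta')]
```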
The third phase, referred to as POST-Migration 210, involves exporting the transformed data back to the user. Depending on its size, the data is compressed and sent to the user through the desired channel. For security purposes the target file can be encrypted before being sent to the destination. Again, this phase can be compared to the delivery of finished goods in a manufacturing scenario.
Fig. 3 is a block diagram depicting the functional blocks of the preferred embodiment of the present invention. As shown in the block diagram the Network Interface 302 helps the data migration tool of the present invention to connect to the source data by either linking or mapping files based on raw DataStream consisting of patterns of data.
The ODBC interface 300 is another source data port which helps the data migration tool of the present invention to connect to source data which is ODBC compliant, using an ODBC compliant driver.
The User Interface layer 305 is an interactive, intuitive tool which helps users define their own data scripts, constraints and translation / transformation functionality without delving into syntactical requirements.
The functionality of the present invention enables it to be capable of interpreting any pattern of data. Hence non-ODBC data sources may need to be parsed (such as XML data) or a simple delimited CSV (Comma separated values) data may require primitive parsing. As shown in the block diagram the Parsing Engine 310 parses only script files such as an XML file. In many cases where the user defines scripts or event handlers, the input may have to be validated for syntactical errors. The Parsing Engine 310 executes these responsibilities.
The Data Dictionary 315 is an object definition and constraint repository, which is created as soon as the source and target data sources are connected. Its primary function is to hold the object-defining entities and the translating parameters between the source and target objects; it serves as a lookup table for all transforming, translating and mapping functional entities of the conversion and hence aids the migration using the present invention. The data dictionary allows an update operation to be correctly predicted, since a write into fixed-pattern legacy data either has to be at an integral multiple of the record size or exactly at an offset position within a record which is an integral addition to its column position or column offset.
The Data Mapper 320 resolves the nearest data type relations. Resolution is based on SQL data types defined by ODBC and the target data type expected, without compromising the data value or relational integrity. This process essentially involves gauging the data width, scale, precision and other transformation aspects with respect to the source and target data type definitions implemented by database vendors. For example, 'Currency' and 'Money' have the same functionality but their naming differentiates these data types. Hence the Data Mapper 320 maps 'Currency' to 'Money' in the Data Dictionary 315. This mapping is done early in the process, when a connection is established from the source to the target. The mapper uses a "best-fit" or "best-available" datatype mapping between source and target (only in case of mismatches or a missing datatype in the target database) using scale and precision mapping of the source and target datatypes, so that the data width does not cause data loss or compromise data quality. In case a datatype with corresponding scale and precision is found, the nearest pattern of data (string to string, number to number or float or double or currency etc.) is mapped so that any conversion process from source column to target column is avoided.
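A best-fit datatype mapping of the kind performed by the Data Mapper 320 might, as a rough sketch, be expressed as a lookup of the nearest target type whose precision can hold the source value; the type names and maximum precisions below are assumed examples.

```python
# Sketch of a "best-fit" datatype mapping between a source and target vendor;
# the type names and widths shown are examples, not an exhaustive map.
BEST_FIT = {
    # source type : assumed nearest target type
    "CURRENCY": "MONEY",
    "TEXT":     "VARCHAR",
    "NUMBER":   "NUMERIC",
}

def map_type(src_type, src_precision, src_scale, target_types):
    """Pick the nearest target type that can hold the source precision/scale
    without data loss; fall back to a wider type when no exact match exists."""
    candidate = BEST_FIT.get(src_type.upper(), src_type.upper())
    for name, max_precision in target_types:
        if name == candidate and max_precision >= src_precision:
            return name, src_precision, src_scale
    # widen: choose the first target type whose precision is sufficient
    for name, max_precision in target_types:
        if max_precision >= src_precision:
            return name, src_precision, src_scale
    raise ValueError("no lossless mapping for %s(%d,%d)" %
                     (src_type, src_precision, src_scale))

target = [("NUMERIC", 38), ("MONEY", 19), ("VARCHAR", 8000)]
print(map_type("CURRENCY", 19, 4, target))   # ('MONEY', 19, 4)
```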
The Data Translator 325 is an extension of the data mapping functionality, used when there exists no corresponding equivalent implementation of the source datatype in the target database. For example, the data translator maps 'text' datatype sources to a variable-length character datatype in the target. The Data Translator 325 maps the most likely data type with corresponding attributes like scale, precision etc. This mapping of attributes can service the data functionality without compromising data handling, managing and manipulating capabilities. For example, a variable character ('varchar') datatype in an SQL source is 8k, while in Oracle the target varchar is 4k.
The Data Transformer 330 is a set of predefined logical, arithmetic and data manipulation functionalities, which help in data cleansing, formatting and loading. A typical example being the trimming of spaces before and after or padding of spaces etc. It also allows changing a datatype from ODBC compliant to non-ODBC compliant.
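For illustration, two of the predefined transformer operations mentioned above, trimming and padding of spaces, could look like the following helpers; these are assumptions sketched for clarity, not the tool's actual functions.

```python
# Small sketch of predefined transformer operations of the kind described
# above (trimming and padding); these helpers are illustrative assumptions.
def trim_spaces(value):
    return value.strip()

def pad_to_width(value, width):
    return value.ljust(width)[:width]     # pad, then clip to the fixed width

print(repr(trim_spaces("  Mumbai  ")))    # 'Mumbai'
print(repr(pad_to_width("IN", 5)))        # 'IN   '
```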
The Data Formatter 335 handles the presentation logic of the source data before it is loaded to the target system. This is generally based on the constraints dictated by the rules of the target database, for example the rules of a relational database, an XSLT applied to XML data, or any format in which a user-dictated delimiter and representation of data masks the source data output.
Often the rules or features supported by the source database and target database may differ. The constraints defined in the source database by the user may not be valid for the target database. In many cases the functionality required to implement constraints is missing in the target database. The Constraint Validator 340 is a functional translator, which works on equivalent functional features rather than data types like the data translator. It also verifies that the transformed data does not inherit any process loss which violates the rules of the target database. Consider the example of Access as the source database, having functions like 'Now()', where conversion to MS SQL as the target database is required; in MS SQL the corresponding equivalent for 'Now()' is a timestamp, and therefore an error has to be reported to the user after Data Migration is complete.
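A function-equivalence lookup of the sort the Constraint Validator 340 performs might be sketched as below; the source/target pairing and the 'Now()' to CURRENT_TIMESTAMP mapping are examples assumed for illustration.

```python
# Hedged sketch of a function-equivalence lookup of the kind the Constraint
# Validator performs; the source/target function names are example guesses.
FUNCTION_EQUIVALENTS = {
    ("ACCESS", "MSSQL"): {"Now()": "CURRENT_TIMESTAMP"},
}

def translate_default(source_db, target_db, expression):
    """Return the equivalent target expression, or flag it for reporting
    to the user after migration when no equivalent exists."""
    mapping = FUNCTION_EQUIVALENTS.get((source_db, target_db), {})
    if expression in mapping:
        return mapping[expression], None
    return expression, "no equivalent for %r; report after migration" % expression

print(translate_default("ACCESS", "MSSQL", "Now()"))
print(translate_default("ACCESS", "MSSQL", "Nz(x, 0)"))
```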
The Script Engine 345 is a programmable-language execution engine, like the one found in 'MS Word', which can execute scripts similar to Visual Basic (VB) script, JavaScript or a database procedural language (PL) such as Oracle's. The Script Engine 345 executes user-defined programming functionality to be operated on the source data for defining business logic. These are generally a set of commands executed upon and driven by data values, rather than operations like triggers which work on target databases. This is useful for defining and accumulating data-related business intelligence information and is an indispensable tool.
The Data Loader 350 chooses methods, depending upon initial data analysis. The process of loading may require temporary storage or staged database, which is managed by the Data Loader 350.
The Event Handler 355 is a set of predefined callback functions designed to work with preset patterns of data and associated function operations on that set of data, taking into consideration the various variables and runtime parameters which can operate on that pattern of data. In other words, the Event Handler 355 carries out DDL and DML operations. The Scheduler 360, as the name suggests, is the primary messaging kernel, which controls and schedules the various block functionalities as per the requested operation execution. In conjunction with the Scheduler 360, the Event Handler 355 can be used for communicating between various instances of the tool running across a distributed network environment and for invoking and executing remote operations.
In FIG. 4 there is shown the process of deriving the Migration Capability Matrix, based on the adherence of ODBC drivers to standards and functionality supported by the database vendor. Hence a complete database migration may not be possible for the following reasons:
Firstly, the drivers do not support all object migration, i.e. the catalog functions of the ODBC drivers do not reveal enough information for complete object migration. Secondly, the information revealed by the driver may be obfuscated. Thirdly, architecture differences between the source database and target database may not support object migration. Fourthly, even when the same object types are supported across the source database and the target database, the features supported by the source vendor for the object may not be available in the target database, for example calculated columns.
The Capability Matrix 400 isolates the above-mentioned discrepancies and finds those object types 405 which can be migrated, such as Tables, Indexes, Views etc. For a database server the objects 410 are databases, tables, views, etc., and the hierarchy of an object 410 is from a server to database, from database to table space, from table space to users, from users to table(s), from tables 420 to views or indexes 425 etc. These hierarchies of object types 405 dictate the sequence of object creation, i.e. a database has to be existent or a new database has to be created. Likewise, if the target database supports tablespaces, which exist in the source too, then tablespaces are created as per the source properties. Many databases have a concept of Filegroups, which have functionality similar to a tablespace in Oracle; hence such object mapping is also derived based on the source vendor with the help of a generic object data dictionary depicted in the architectural block diagram. Hence, prior to the process of migration and the derivation of these capabilities, a little groundwork is needed about the server, its resources and all objects which are required before a database Data Source Name (DSN) connection; that is, a database user or schema is assumed to be existent, else an existing database with a known user schema can be used to create these. While determining the precedence and sequence of object creation, an analysis of each object with respect to the data in the object is performed; that is, all object types 405 which the source and target database support are broadly classified into objects with data 430 or without data 435, i.e. objects for which only a definition exists and which are executable (like views, procedures, functions etc.), and the rest, which are objects with definition and data (like tables, indexes etc.). The visual interpretation of the object matrix is as shown in Figure 4.
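A simplified capability matrix could be derived, as a sketch under assumed supported-object sets, by intersecting the object types of the source and target and splitting them into objects with data and definition-only objects:

```python
# Illustrative sketch of deriving a simple capability matrix: object types
# supported by both sides, split into objects with data and definition-only
# objects; the supported-type sets are assumptions for the example.
SOURCE_OBJECTS = {"TABLE", "INDEX", "VIEW", "PROCEDURE", "CALCULATED COLUMN"}
TARGET_OBJECTS = {"TABLE", "INDEX", "VIEW", "PROCEDURE"}

WITH_DATA = {"TABLE", "INDEX"}                     # definition plus data
DEFINITION_ONLY = {"VIEW", "PROCEDURE", "FUNCTION"}  # executable definitions

def capability_matrix(source, target):
    migratable = source & target
    return {
        "with_data": sorted(migratable & WITH_DATA),
        "definition_only": sorted(migratable & DEFINITION_ONLY),
        "not_migratable": sorted(source - target),   # reported as discrepancies
    }

print(capability_matrix(SOURCE_OBJECTS, TARGET_OBJECTS))
```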
The translation metrics is the interpretation of data-type naming conventions against the data functionality between the source and target database. ODBC standards support a set of SQL data types which have a predefined functional interpretation, i.e. the Datetime data type supports date and time as a combined entity; similarly, the string data type supports alphanumeric data of a specified width. The size of a data byte depends on the Unicode standard adopted and the character set of the National Language Support (NLS) used by the application to archive data. For example, consider an English character set, which can be encoded using different code sets. A code set defines the bit patterns that the system uses to identify characters. A single-byte encoding method is sufficient for representing the English character set because the number of characters is not large. To support larger alphabets such as Chinese, additional code sets containing multibyte encodings are necessary. All the supported single-byte and multibyte code sets must handle character encoding of one or more bytes. Hence the data byte can be classified as Single-Byte Code Sets (SBCS), Multi-Byte Code Sets (MBCS) or Double-Byte Code Sets (DBCS) based on the version of the Unicode standard and the NLS setting. This information, derived from the databases after successful connection via various ODBC Application Program Interface (API) calls executed across the source and target database, yields all the SQL data type sets supported by each vendor. The metrics maps common SQL data types supported by both (source and target), irrespective of their naming convention, based on their scales and precisions and their capability to hold such character sets. For example, the variable character length (varchar) datatype holds string data, but the nvarchar datatype holds Unicode string data, in which the data width depends on the Unicode standard adopted. Data types which are existent in the source but missing in the target need a translational map which maps one SQL data type to another without loss of data. Many vendors have a different interpretation of an SQL data type with respect to data functionality, hence the present invention allows the user to deduce his own mapping in case the user wants to override the estimated mapping metrics. In either of the cases of the object matrix or translation metrics, objects or data which do not have a clear definition and path for translated migration yield an error.
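If the source and target are reachable through ODBC, the per-vendor SQL type sets that feed the translation metrics could be gathered roughly as follows; this sketch assumes the pyodbc package and two pre-existing DSNs named 'SRC' and 'TGT', which are illustrative placeholders.

```python
# Sketch (assumes pyodbc and two pre-existing DSNs "SRC" and "TGT"): query each
# driver's SQLGetTypeInfo catalog and intersect the SQL data types both vendors
# support, which is the starting point of the translation metrics.
import pyodbc

def supported_sql_types(dsn, uid, pwd):
    conn = pyodbc.connect("DSN=%s;UID=%s;PWD=%s" % (dsn, uid, pwd))
    try:
        cursor = conn.cursor()
        # SQLGetTypeInfo result set: column 1 is TYPE_NAME, column 2 is the
        # SQL data type code; build a {code: vendor type name} dictionary.
        return {row[1]: row[0] for row in cursor.getTypeInfo()}
    finally:
        conn.close()

source_types = supported_sql_types("SRC", "scott", "tiger")   # placeholder creds
target_types = supported_sql_types("TGT", "scott", "tiger")

common = set(source_types) & set(target_types)
missing_in_target = set(source_types) - set(target_types)     # needs translation map
print("common SQL types:", sorted(common))
print("needs translation:", sorted(missing_in_target))
```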
As illustrated in FIG. 5 there is shown a flow chart illustrating a preferred embodiment of the present invention showing the mechanism for Migration from ODBC Compliant source format to another ODBC Compliant target format using the Data Migration Service in accordance with the present invention.
The first step in the migration process involves the selection of the source from which the data has to be migrated. The user is expected to know the data definition and data manipulation of the source 500. Next, the user manually classifies the nature of the source 502 and target clients. It then proceeds to ascertain whether the databases are ODBC compliant 504 or not. In the event the file is zipped, its output after unzipping using the preprocess command is given to the user 506 to manually classify the source. Based on the classification, the entire process flow can be explained in four combinations.
Firstly, relational database to relational database migration. Secondly, relational database to non-relational database migration. Thirdly, non-relational database to relational database migration.
Fourthly, non-relational database to non-relational database migration.
If the data source is Non-ODBC compliant it could be binary, text or a preprocess command 508. If the data source is ODBC compliant then the user selects the source (to copy data from) ODBC driver, for which the Data Source Name (DSN), consisting of appropriate parameters like server name, database user, etc., is checked for existence 510. If the DSN does not exist then all required DSN details are obtained 512 and a new DSN connection to the source is created 514. The DSN is the access door to the database. In the event the DSN is created successfully the connection to the source database is established or checked 516. If there is an error connecting to the source an appropriate error message is reported 518. In the event the connection to the source exists then the username and password are checked for authenticity 520. In case of failure to connect to the source due to an unauthentic user/password an appropriate error is reported 518. In the event that the user name and password for that database are authentic, the database objects, that is the tables, are read as per the user's rights/privileges 522. The preferred embodiment of the present invention internally reads the definition of all available objects as per the user's appropriate rights and privileges. The target database is then chosen 524 and the DSN is checked for existence 526. In the event the DSN does not exist then all required DSN details are obtained 528 and the new DSN connection to the target is created 530. The DSN is the access door to the database. In the event the DSN is in existence or is created successfully, the connection to the target database is established 532. If there is an error connecting to the target an appropriate error message is reported 518. In the event the connection to the target database is established, the verification of the source and target database compliance with respect to vendors, product, etc. is carried out 534. In case the source and target databases are different, the evaluation of the target database capabilities with respect to the requirements of the source database is carried out 536. In case the source database and target database are from the same vendor, the version compliance is checked 538. In the event the versions are not compatible, the estimation of the target database capabilities with respect to the requirements of the source database 536 is carried out. The Capability Matrix is then derived 540 to estimate the success of the data migration. The translation metrics is then derived 542 based on the source database's data type availabilities and whether every available source data type supported in the source database has an equivalent data type and data supporting capabilities. The translation option is then verified with respect to the target database capabilities 544. Based on the user's selection of objects and the adherence of the vendors to ODBC standards, an object-to-object verification of what can possibly be migrated is estimated and a source object count that can be migrated is derived from this 546. Even on checking the version compliance 538, if the versions are compatible, an object-to-object verification of what can possibly be migrated is estimated and a source object count that can be migrated is derived from this 546. Following this process, an object collection is created 548 from the selection specified in the migration option for which there stands a fair chance of successful object migration. An object count of the collection is then derived 550.
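The DSN connection and privilege-scoped table listing described above could be approximated with pyodbc as in the following hedged sketch; the DSN name, credentials and error handling are assumptions for illustration.

```python
# Hedged sketch of the source-connection steps above using pyodbc; the DSN
# name, credentials and driver behaviour are assumptions for illustration.
import pyodbc

def connect_source(dsn, uid, pwd):
    try:
        return pyodbc.connect("DSN=%s;UID=%s;PWD=%s" % (dsn, uid, pwd))
    except pyodbc.Error as exc:
        # corresponds to reporting an appropriate error message (518)
        raise RuntimeError("error connecting to source: %s" % exc)

def readable_tables(connection):
    # reads the database objects (tables) visible under the user's privileges (522)
    cursor = connection.cursor()
    return [row.table_name for row in cursor.tables(tableType="TABLE")]

src = connect_source("SRC", "scott", "tiger")   # placeholder DSN and credentials
print(readable_tables(src))
src.close()
```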
Also, the existence of another object is checked for 552, after which SQL queries (DDL) are prepared 554 as per ODBC adherence and the capability matrix and translation metrics supported by the target vendor for object creation. One or more tables can be selected from this list, and also the destination table name can be specified as the object to be dropped as per the option specified 556. In the event of no drop-object being specified in the option, the object creation script is executed directly 558. If the user has selected the option to drop the object before firing the creation, the target objects are dropped 560. Such cases may arise, firstly, if the target database already has legacy data from prior applications, and secondly, if the existing objects are remnants of previous aborted or unsuccessful migrations.
The next step involved is the process of checking whether the object creation is successful 560. In the event of an unsuccessful object creation, an error is reported and the translation script is resumed 562. In case the object creation is successful, a check for the SQL log option specified is carried out 564. In case an SQL log option is specified, the SQL is appended to a log or an XML file 566. In the event of no SQL log option being specified, an estimate of the source object data count is derived 568. Similarly, after appending to the log or XML file 566, an estimate of the source object data count is derived 568. After this step, the DDL statements are prepared for the target database object with respect to the translation metrics 570. Next, a check is carried out for any specified SQL log options 572. In the event an SQL log option is specified, the SQL statement is appended to a log or an XML file 574. If no SQL log option is specified, the DDL statements on the target database object are executed 576. Similarly, after appending the SQL to the log or XML file 574, the DDL statements on the target database object are executed 576.
After the above process, the DML statements for the target database object are prepared with respect to the transform metrics 578. The DML statements are then checked to ascertain whether any SQL log options are specified 580. In case an SQL log option is specified, the statement is appended to a log or an XML file 582. In the event of no SQL log option being specified, the DML statement on the target database object is executed 584. Similarly, after appending to the log/XML file 582, the DML statement on the target database object is executed 584.
The DML statement is then checked to ascertain whether the execution is successful 586. Upon successful execution of the DML statement on the target database, the status is reported and logging as per the error conditions is carried out 588. In the event of an unsuccessful DML statement execution, an intelligent script-based error handler is also supported, which allows verifying steps to be executed in case of errors 590. The preferred embodiment of the present invention then checks for any valid steps to handle the generated error 592. In the event of any valid existing steps to handle the generated error, the user-defined steps as per the error conditions are executed 593. The next step involves the flagging/reporting and logging as per the error conditions 588. Similarly, in case of no valid step to handle the generated errors, the errors are flagged/reported and logged as per the error conditions 588.
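Executing a generated DML statement with the optional SQL-log behaviour and a user-defined error step, as described above, might be sketched as follows; SQLite stands in for the target database purely as an assumption so that the example is self-contained.

```python
# Hedged sketch: execute a generated DML statement, optionally append it to a
# SQL log, and route failures to a user-defined error step (per 582/588/593).
import sqlite3

def execute_dml(cursor, statement, params=(), sql_log=None, on_error=None):
    """Optionally append the statement to a SQL log, execute it, and hand
    failures to a user-defined error step instead of aborting the whole run."""
    if sql_log is not None:
        with open(sql_log, "a") as log:
            log.write(statement + ";\n")
    try:
        cursor.execute(statement, params)
        return True
    except sqlite3.Error as exc:
        if on_error is not None:
            on_error(statement, exc)      # user-defined steps per error condition
        return False

conn = sqlite3.connect(":memory:")        # SQLite used only to keep the demo local
cur = conn.cursor()
execute_dml(cur, "CREATE TABLE t (id INTEGER PRIMARY KEY)")
ok = execute_dml(cur, "INSERT INTO t VALUES (?)", (1,),
                 on_error=lambda stmt, exc: print("failed:", stmt, exc))
print(ok)   # True
```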
The Datacount is then checked to ascertain whether it is equal to zero 594. In the event it is not equal to zero, the data count is decreased by one until it is zero 595. After this step, the process of preparing the DML statements for the target database object with respect to the transform metrics (578 onwards) is carried out until the data count becomes zero. On checking whether the Datacount is equal to zero 594, in the event that the Datacount is equal to zero, the object count is decreased until it becomes zero 596. If any more objects are remaining, this is checked until the object count becomes zero 597. In the event that the object count is not equal to zero, the existence of another object is checked for 552.
In the event the object count is zero, the completion of the migration is reported to the user 598. The above process is followed when the source and target are ODBC compliant databases.
FIG. 6 is a flow diagram illustrating an alternative embodiment depicting the mechanism for Migration from a Binary File to an ODBC compliant format using the present invention. For transferring non-ODBC compliant source data to a target ODBC compliant 105 database, the data definition has to be clearly known by the user. If the source data is non-ODBC compliant 600, it is checked to ascertain whether it is a Binary File 602. In the event the source file is non-binary, further processing is done as text or via a preprocess command 604. Further, in the event it is a Binary file, it is checked for a single DDL 606. In the event that it is a single DDL, the DDL information is obtained 608. In the event more than one DDL is present, the DDL count is obtained 610. After this step, the DDL information is sought 612 and saved in a DDL collection 614. The DDL count is then decremented 616 and checked to see whether the DDL count has become equal to zero 618. In the event that the DDL count is not equal to zero, the DDL information is sought 612 again.
In the event that the DDL count is equal to zero, the binary file is opened 620. The file is then checked for headers 622. In the event a header is present, the header is skipped 624. Many sources of data, such as .DBF files from FoxPro / Clipper, have data definitions as part of the data file header. In case the file has no header, or after the header is skipped, the data is read as per the source DML 626. The data is then checked to ascertain whether it has a single DML 628. In the event of a single DML, the mapping of columns as per the source DML is carried out 630. In case of a non-single DML, the translation of the DML is carried out 632, after which the process of mapping columns as per the source DML is carried out 630. After this step, the column count is obtained 634 and the data content is verified as per the DML 636. The data is then checked for correctness 638 and, in case the data is incorrect, an error is reported and a log entry is made 640. In the event of the data being correct, the constraints are verified 642. Next, constraint violations are checked for 644. In the event of no constraint violation, the optional values in the column are used 646. In the event there is a constraint violation, a check is carried out for a default data value if one exists 648. In the event the default value exists, the column is updated to the default value 650. In the event the default data value is non-existent, the column is updated with null data 652. The next process involves checking for any further columns 654. If any more columns exist, it proceeds to translate the next column's data 656. After this step it proceeds to check the data for correctness 638.
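Reading fixed-size binary records after skipping a header, in the spirit of the flow just described, could be sketched as below; the header size and record layout are assumed values standing in for a real DDL definition.

```python
# Minimal sketch (assumed record layout) of reading fixed-size binary records
# after skipping a header, in the spirit of the flow described above.
import struct

HEADER_BYTES = 32                       # assumed header size from the DDL
RECORD_FORMAT = "<i20sd"                # int id, 20-byte name, double balance
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)

def read_binary_records(path):
    rows = []
    with open(path, "rb") as f:
        f.seek(HEADER_BYTES)            # skip the file header (624)
        while True:
            chunk = f.read(RECORD_SIZE)
            if len(chunk) < RECORD_SIZE:        # EOF reached (658)
                break
            rec_id, name, balance = struct.unpack(RECORD_FORMAT, chunk)
            rows.append((rec_id, name.rstrip(b"\x00 ").decode("ascii"), balance))
    return rows
```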
On checking for any further columns 654, if no further columns exist, the End of file (EOF) is checked for 658. If the EOF is not reached then the row values are inserted 660. Next, the execution of user defined business logic script is carried out 662. It is then checked to ascertain if it is successfully executed 664.
In the event of unsuccessful execution an appropriate error is flagged 666. On successful execution it proceeds to translate the next data record 668 and then to read the data as per the source DML 626. However, on checking for EOF 658, if the EOF is reached then the file is closed 670.
FIG. 7 is a flow diagram illustrating an alternate embodiment of the present invention, depicting the mechanism for Migration from a Text file to an ODBC compliant format using the Data Migration Tool of the present invention.
Often, if the data source is non-ODBC and non-binary, it could possibly be a text file or it may require a preprocess command.
If the source data is non-ODBC compliant and non-binary 700, it is checked to ascertain whether it is a Text File 702. In the event that it is not a Text File, it is classified as a pre-process command 704. In an alternate embodiment of the present invention, if the source data is not a Text File, it is checked for further kinds of commands. On classifying the source data as a pre-process command 704, the pre-process is executed on the source file 706 and the source file is derived from the pre-process 708. Now, using this source data file, the user has to classify the source data, and further operations are carried out as per the nature of the file 710. For example, if the source file is in Zip format then a preprocess command is performed on it, that is, it is unzipped. The source file so derived is classified by the user as per the nature of the source file and the further operations are carried out as per the file type. On checking the source data to ascertain whether it is a Text File 702, if it is found to be a Text File the next step involves the identification of the delimiter 712 in such a Text File. After such delimiters are identified, they are classified 714 and the file is opened 716. However, if the data in the text file does not contain delimiters, it is checked for a markup language 718. After the language is identified, it is checked for the format 720 and associated definition 722, after which it proceeds to open the file 716. Once the file is opened 716, it is read line by line 724 and columns are created as per the delimiters 726. The data is then read as per the DDL 728.
The data is then checked for a single DDL 730. If a single DDL is found, the columns are mapped as per the source DDL 732. In the event that a single DDL is not found, the DDL is translated 734, after which the columns are mapped as per the source DDL 732.
After the corresponding data mapping as per the source, the column count is obtained 736 and the verification of data content as per the DDL is carried out 738. Next, the data is checked for correctness 740. If the data is incorrect, an error is flagged 742. In the event of the data being correct, the constraints verification is carried out 744, after which the data is checked for constraint violations 746. In the event of no constraint violation, the optional values in the column are used 748. In the event of a constraint violation, the default value for the column is checked 750. In the event of the default value existing, the column is updated to the default data value 752. In the event the default data value is nonexistent, null data is updated 754.
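The delimited-text flow above, opening the file, reading it line by line, splitting on the classified delimiter and substituting a default or null when a constraint is violated, might be sketched as follows; the delimiter and column definitions are assumptions.

```python
# Sketch, under an assumed DDL, of the delimited-text flow: open the file,
# read it line by line, split on the classified delimiter, and substitute a
# default (or null) when a constraint such as NOT NULL is violated.
DELIMITER = "|"
COLUMNS = [
    {"name": "ID",   "not_null": True,  "default": None},
    {"name": "NAME", "not_null": True,  "default": "UNKNOWN"},
    {"name": "CITY", "not_null": False, "default": None},
]

def load_text_file(path):
    rows = []
    with open(path, "r") as f:
        for line in f:                              # read line by line (724)
            values = line.rstrip("\n").split(DELIMITER)
            row = []
            for col, value in zip(COLUMNS, values):
                if value == "" and col["not_null"]:
                    # constraint violated: use default if one exists, else null
                    value = col["default"] if col["default"] is not None else None
                row.append(value)
            rows.append(tuple(row))
    return rows
```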
It then proceeds to check for any further columns 756. In the event of the existence of any further columns, it proceeds to translate the next column's data 758, after which the data is checked for correctness 740. However, if no further columns exist, it is checked for EOF 760. If the EOF is reached, it proceeds to close the file 762. If it is not at EOF, it proceeds to insert the row values 764. The next step involves the execution of the user-defined business logic script 766. It then proceeds to check for successful execution 768. On successful execution it proceeds to translate the next data record 770 and then reads the data as per the DDL 728. However, in the event of unsuccessful execution it flags an appropriate error 772. Certain exceptions are followed when the source and target both support data types like LOBs (LOB stands for Large Objects). Most of the current data migration tools do not support LOB data migration unless the database vendor himself provides the migration tool and the target database vendor supports him. The present data migration tool has the capability of mapping and migrating the correct LOB translations and verifying successful migration.
Fig. 8 illustrates the first step of data transfer between Heterogeneous Databases using the 'Data Migration Wizard'. The Data Migration Wizard guides the user through a series of simple steps to transfer data between different databases. As shown, this is step 1 of the 7-step procedure. It is carried out when the user clicks on the 'ODBC' button. This brings up a form titled 'Data Migration Wizard' 800. The 'Next' icon 805 has to be clicked to continue. This brings up the Choose A Data Source(s) page.
Fig. 9 illustrates the second step of transfer of data between Heterogeneous Databases using the 'Data Migration Wizard'. As shown in the diagram, this is step 2 of the 7-step procedure. After the user clicks 'Next' on the 'Data Migration Wizard', it brings up the Choose A Data Source(s) page. The source ODBC driver 900 has to be selected to copy data from the source. Then the DSN 905 that is created using that driver is selected. The 'New DSN' button 910 allows the user to create a new data source. The User Identification (UID) 915 and Password (PWD) 920 are then entered for that database and the 'Connect' button 925 is clicked to connect to the database. 'Next' 930 has to be clicked to proceed, which takes the user to the Select Source Tables page.
Fig. 10 illustrates the third step of transfer of data between Heterogeneous Databases using the Data Migration Wizard. As shown, this is step 3 of the 7-step procedure. After the user clicks 'Next' on the Choose A Data Source(s) page of the Data Migration Wizard, it brings up the user interface for Select Source Table for Migration. A list of all the tables in the source database will now be available, along with a count of records in each table. One or more tables can be selected from this list and also a destination table name can be specified. The 'Select All' option 1000 allows the selection of all the tables available in the selected schema. The 'Next' icon 1005 leads the user to the Choose A Destination page.
Fig. 11 illustrates the fourth step of transfer of data between Heterogeneous Databases using the Data Migration Wizard. As shown, this is step 4 of the 7-step process. After the user clicks 'Next' on the 'Choose A Data Source (Destination)' page of the Data Migration Wizard, it brings up the user interface for Choosing a Destination. The ODBC driver 1100 is selected to copy data to (the destination). The DSN 1105 that is created using the driver is selected from the list given below. The 'New DSN' icon 1110 allows the user to create a new data source. The UID 1115 and PWD 1120 for that database are entered and the 'Connect' icon 1125 is clicked to connect to the database. Clicking the 'Next' icon 1130 leads the user to the 'Select the Options for Migration' page.
Fig. 12 illustrates a fifth step of transfer of data between Heterogeneous Databases using Data Migration Wizard. As shown this is step 5 of the 7-step process. After the user clicks 'Next' on the 'Choose a Destination of the Data Migration Wizard', it brings up this user interface for Selecting the Options for Migration.
This page allows the user to select the manner in which the data transfer should take place. The user can opt between the following: Transfer Table(s) With Data 1200, where the entire table along with the records is transferred; and Transfer Table Structure Only, where only the table definition is transferred. In the case of existing tables, the user can opt for one of the following options.
Show Error 1205, where the table is not transferred but an error is displayed. Drop The Table 1210, where the existing table is first dropped and a new table is created; all existing data in the destination table is permanently deleted. Append The Records 1220, where the table is retained and the records are simply appended to it. Clicking on the 'Next' icon 1225 takes the user to the 'Transform The Column Information' page.
Fig. 13 illustrates the sixth step of data transfer between Heterogeneous Databases using the Data Migration Wizard. As shown, this is step 6 of the 7-step process. After the user clicks 'Next' on the Selecting the Options for Migration page of the Data Migration Wizard, it brings up the user interface for 'Transform The Column Information'. 'Destination Table' 1300, indicating a list of all the selected tables, is available to the user for further transformations, and the 'Transform' icon 1305 leads the user to the next page.
Fig. 14 illustrates the seventh and final step of data transfer between Heterogeneous Databases using the Data Migration Wizard. As shown in the diagram, after the user clicks 'Transform' on the Transform The Column Information page of the Data Migration Wizard, it brings up the user interface for further transformation of the column information. On pressing the 'Transform' icon in the previous diagram, the Column Mapping feature opens, where the user can change the data type, size and precision attributes of the columns of the table, one at a time. Selected Columns 1400 includes a list of columns which can be fetched from the Source table. The data type for each destination column can be entered from the drop-down list in the Data Type field 1405. The length in units corresponding to the data type of the destination column can be entered in the Length field 1410. The Length field is only applicable for the char, varchar, nchar, nvarchar, binary and varbinary data types. A size smaller than the length in the source table can result in data truncation. Decimal 1415 applies to decimal and numeric data types only, where the maximum number of decimal digits that can be stored, to the left and to the right of the decimal point, can be entered. By using the up-arrow button 1420 the columns required for data transfer can be placed in the Available Columns list 1425. The down-arrow button 1430 is used to remove a column from the list. Clicking on the 'Apply' icon 1435 stores the transformed values, while the 'Cancel' icon 1440 reverts all the changes made by the user. Clicking on the 'Finish' icon 1445 begins the process of transferring data.
Fig. 15 illustrates the final report of the transfer of data between Heterogeneous Databases using the Data Migration Wizard. After the user clicks 'Next' on the 'Transform The Column Information' page of the Data Migration Wizard, it brings up this user interface for reporting the Data Migration status. After migration is complete, the transfer of data along with the time taken for the transfer can be viewed here.
In order to import data from text files and display it in a spreadsheet, the first step is to click on the 'Non-ODBC' icon. This brings up a form titled File Loader, which is illustrated in Fig. 16.
As illustrated in Fig 16, clicking on the 'Connect' 1600 icon leads the user to the page described in Fig 17. Clicking on the 'Import' icon 1605 on the other hand leads the user to the page described in Fig 18.
Fig. 17 depicts the page in which the Connection Parameters are to be inserted. The user is required to select several parameters, such as the database 1700 into which the data has to be transferred, enter the UID 1705 and Password 1710, and connect to that database by clicking the 'OK' icon 1715.
Fig 18 illustrates a page with an open dialog box from which the file that is to be transferred is selected.
Fig 19 illustrates the page that opens on selecting the file that is to be transferred and it displays the sample data from the file.
After viewing the file, the user is required to enter the file delimiter (if present) in the 'Enter Delimiter' field 1900, as well as the column number where the table name is specified in the 'Enter Column No' field 1905. Clicking on the 'Get Tables' icon 1910 displays a list of the tables present in the file.
On selecting the tables to transfer, the user is required to double-click on the table he wants to view.
Fig 20 illustrates the File Loader form, wherein by clicking the 'Hex /Binary View (Bytes)' icon 2000, the user can view contents of the flat file in hexadecimal and ASCII formats. The structure of the Byte Arrangement tab is as follows:
The leftmost section lists the file position number after every 16 bytes and also shows the file size in bytes at the top of the section. The middle section provides a row of 16-byte block information in hexadecimal format. The right section provides the same information in ASCII characters. Characters of ASCII value less than 32 are not visible.
If any hex/character value is selected, the corresponding character/hex value is highlighted. Its ASCII and binary values are also displayed at the upper right corner.
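The byte-arrangement view described above (offset, hexadecimal block and printable ASCII, with characters below ASCII 32 hidden) can be approximated by a short hex-dump sketch such as the following; the formatting widths are assumptions.

```python
# Hedged sketch of the byte-arrangement view: 16 bytes per row, file offset on
# the left, hexadecimal in the middle, printable ASCII on the right
# (characters below ASCII 32 shown as '.').
def hex_view(path, width=16):
    with open(path, "rb") as f:
        data = f.read()
    print("file size: %d bytes" % len(data))
    for offset in range(0, len(data), width):
        block = data[offset:offset + width]
        hex_part = " ".join("%02X" % b for b in block)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in block)
        print("%08d  %-47s  %s" % (offset, hex_part, ascii_part))
```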
Fig. 21 illustrates the page that loads on clicking the 'General Parameters' icon 2100, in which the contents of the file are displayed. The user is required to specify whether the file is a text file or a binary file 2105.
If the file is a Text file, the user is required to specify the delimiter (if any) that separates the data in the file. The user is required to classify the records 2110 present in the table. In case the data in the file is not separated by a delimiter 2115, the number of bytes per record is specified in the space provided. Clicking the 'Generate Fields' button 2120 generates the number of columns in the table.
The table structure will be displayed. The user specifies the number of bytes per record 2125 and chooses the data type for each column from a drop-down box, as illustrated in Fig. 21.
As depicted in Fig. 22, after the user clicks on the 'Table Definition' tab 2200, the structure of the table can be redefined. The user can modify the column name, data type, field size or precision/scale as per the requirements. The user can also specify the column-level 2205 and table-level parameters 2210 in the grid provided, as illustrated in Fig. 22.
As depicted in Fig. 23, after the user clicks on the 'Constraints/Indexes' tab, the user can specify the constraints in the table named 'Constraints' 2305. In the case of a foreign key, the reference table is specified. The index options, i.e. name, type, field name and sort order, can be specified in the table named 'Indexes' 2310.
As depicted in Fig. 24, after the user clicks on the 'Datasheet View' tab 2400, the data in the table is displayed in a tabular form.
As depicted in Fig. 25, to view the transformed table the user selects the 'Transformed View' option 2505. This view helps to compare the source data and the target data. In order to view only the transformed data, the user clicks the 'Hide Source Column' option 2510, as illustrated in Fig. 25.
In order to save the scripts in a file, the 'Prepare SQL Script' option 2515 is selected, as illustrated in Fig. 25. In order to transfer the data into the selected database, the 'Insert into Database' option is selected.
Once the structure is correctly defined and the data is put as per the requirement, the user clicks on the 'Finish' button 2525 to start transferring data from file to the Destination Database as illustrated in Fig. 25.
Fig. 26 illustrates importing data from Flat Files using the File Loader. A flat file is any text or binary file containing raw data, possibly belonging to a legacy database. The File Loader helps the user to import data from flat files and display it in a spreadsheet. The steps to be followed to perform this operation are as follows:
Once the user clicks on the 'Binary File' button, a form titled File Loader is brought up. The first tab is the Bytes Information tab 2600. Here the user can specify the structure of the flat file. This information is used to correctly load the data contained in the flat file.
The user is required to enter the values in the space provided, such as the total header bytes 2605, total footer bytes 2610, total bytes in a record 2615 and total columns in a record 2620. After the user clicks on the 'OK' button 2625, a number of blank rows corresponding to the value specified by the user for the total number of columns in Step 1 is loaded in a spreadsheet, as illustrated in the next diagram.
As illustrated in Fig. 27, after the user clicks on the 'OK' button 2700, the user is required to enter the Column Information in the spreadsheet, i.e. the Column Name 2705, the bytes required by each column 2710, the column data type 2715 and the color 2720, as shown in Fig. 27.
After this, as illustrated in Fig. 28, the user can proceed to click on the 'Process' button 2800. Clicking the 'OK' button opens a file chooser 2805 and the user is prompted to select the file.
As illustrated in Fig. 29, the user can click on the 'Bytes Arrangement In the File' tab 2900 to view the contents of the flat file in hexadecimal and ASCII formats. The structure of the Byte Arrangement tab is as shown in the figure: the leftmost section lists the file position number 2905 after every 16 bytes and also shows the file size in bytes 2910 at the top of the section. The middle section provides a row of 16-byte block information in hexadecimal format 2910. The right section provides the same information in ASCII characters; characters with ASCII values less than 32 are not visible. If any hex/character value is selected, the corresponding character/hex value is highlighted, and its ASCII and binary values are also displayed at the upper right corner. The user can scroll down the list a row at a time using the arrow keys or a page at a time using the page-up and page-down keys. Additionally, navigational buttons, including a 'Go To' button, are provided at the upper right corner of the tool window. The user can also search through the list using a hex search or a (case-sensitive) character search via the search fields provided in the data migration tool window; successful matches are highlighted in the appropriate windows. A printing facility is also available.
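A minimal sketch of the hexadecimal/ASCII presentation described above (16 bytes per row, file position on the left, non-printable characters suppressed) is given below. It only illustrates the display format, assuming a simple console rendering, and is not the tool's viewer.

    # Minimal hex/ASCII dump: 16-byte rows, file position on the left,
    # hex in the middle, ASCII on the right (values below 32 shown as '.').

    def hex_ascii_dump(path):
        with open(path, "rb") as f:
            data = f.read()
        print("File size: %d bytes" % len(data))
        for pos in range(0, len(data), 16):
            block = data[pos:pos + 16]
            hex_part = " ".join("%02X" % b for b in block)
            ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in block)
            print("%08d  %-47s  %s" % (pos, hex_part, ascii_part))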
As illustrated in Fig. 30, the user clicks on the 'Bytes Transfer In to the Records' tab 3000, where the exported data is displayed in a tabular manner. As illustrated in the user interface, the displayed data can be written into a SQL script 3005 and/or inserted into the database 3010 by selecting the appropriate options.
The user then clicks on the 'Finish' button 3015. If the 'Insert Into Database' option 3010 is selected, a 'Logon' form opens.
After the user selects 'Insert Into Database', a 'Logon' screen opens as depicted in Fig. 31, through which the data is transferred. The user enters the username (UID) 3110 and password 3115, along with other parameters such as the DSN 3105, driver 3125 and server 3130, and a connection to the database is established 3120 by clicking the 'OK' button 3135. The data from the file is then inserted into the corresponding database.
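For illustration only, the logon parameters shown in Fig. 31 (DSN, UID, password, driver, server) correspond to the kind of ODBC connection sketched below using the third-party pyodbc module. The DSN name, credentials and table/column names are assumptions made for this sketch, and the tool is not implied to use this particular library.

    # Hedged sketch of connecting via an ODBC DSN and inserting loaded rows.
    # DSN name, credentials and table/column names are illustrative assumptions.

    import pyodbc

    conn = pyodbc.connect("DSN=TargetDB;UID=scott;PWD=tiger")   # DSN 3105, UID 3110, password 3115
    cursor = conn.cursor()

    rows = [(1, "Smith"), (2, "Patel")]
    for emp_id, emp_name in rows:
        cursor.execute(
            "INSERT INTO EMPLOYEE (EMP_ID, EMP_NAME) VALUES (?, ?)",
            emp_id, emp_name,
        )
    conn.commit()
    conn.close()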

Claims

What is claimed is:
1. A method of migrating relational and non-relational source data, independent of the source of generation, to a target data system, comprising the steps of: creating object definitions, rules and constraints once the plurality of source data and corresponding plurality of target data have been linked; mapping data from said plurality of source data to corresponding said plurality of target data using said definitions in memory store; translating the unmapped source data to the logically similar target data using a predetermined driver, wherein the process involves no loss of said source data while migrating to said target data and retains original information of said source data; formatting the said translated data types as per corresponding said plurality of target data using predetermined operations; and a memory storage means to store the said object definitions, rules and constraints once the said plurality of source data and corresponding said plurality of target data have been linked.
2. The method as recited in claim 1 wherein said rules, definitions and constraints are isolated from said data migration application whereby said rules can be modified, deleted, or added to without any rewrite of said data migration application code.
3. The method as recited in claim 1 wherein the migration of said plurality of said source data is done using harmonics of pattern in said plurality of said source data.
4. The method as recited in claim 1 wherein triggers can be incorporated before, after or during every step of said migration process to script, track and execute user defined functionalities.
5. The method as recited in claim 4 wherein the application is designed whereby said user retains the control to halt said migration process intermediately and change predetermined conditions of trigger and thereafter resume said migration process without having to restart said migration process from the beginning.
6. The method as recited in claim 1 wherein said migration process can be carried out temporarily on a random sample of said plurality of source data, for the user to verify whether the objects and functionality of corresponding said plurality of target data are substantially similar to said plurality of source data.
7. The method as recited in claim 6 wherein said user can create a graphical view representing the relational or non-relational data before and after said migration based on said temporary migration process.
8. A data migration system for migrating data from relational and non-relational source data, independent of the source of generation, to a target data system, comprising: a data mapper to evaluate characteristics of transformation of said plurality of source data; a data translator to map the plurality of said source data type to an equivalent said plurality of target data type; and a data transformer to change an ODBC compliant said plurality of source data type to a non-ODBC compliant data type and vice-versa; whereby said migration process involves no loss of said plurality of source data while migrating to corresponding said plurality of target data and retains original definition of said plurality of source data.
9. The data migration system as recited in claim 8 includes a memory store comprising an object definition and data repository of rules and conditions to define said objects and parameters for translation of said objects between the said plurality of source data and corresponding said plurality of target data.
10. The data migration system as recited in claim 9 wherein said migration is executed by associating and migrating said plurality of source data to said plurality of target data based on said memory store.
11. The data migration system as recited in claim 10 wherein said memory store resides outside of said data migration application to facilitate the update of said rules and objects without any rewrite of said data migration application code.
12. The data migration system as recited in claim 10 is capable of creating a graphical view representing relational or non-relational data before and after said data migration.
13. A method of migrating a plurality of relational and non-relational source data, independent of the source of generation, to a corresponding plurality of target data systems, comprising the steps of: extracting said data from multiple heterogeneous data sources irrespective of said data being in a relational or non-relational database form; transforming said extracted data to the required form of target data; cleansing said transformed data of errors existing in said source data but unwanted in said target data; and consolidating said cleansed data for migrating to the target data.
PCT/IN2004/000026 2003-01-30 2004-01-29 System and method for data migration and conversion WO2004077215A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN121/MUM/2003 2003-01-30
IN121MU2003 2003-01-30

Publications (2)

Publication Number Publication Date
WO2004077215A2 true WO2004077215A2 (en) 2004-09-10
WO2004077215A3 WO2004077215A3 (en) 2005-05-26

Family

ID=32922935

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2004/000026 WO2004077215A2 (en) 2003-01-30 2004-01-29 System and method for data migration and conversion

Country Status (1)

Country Link
WO (1) WO2004077215A2 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6151608A (en) * 1998-04-07 2000-11-21 Crystallize, Inc. Method and system for migrating data
US6356901B1 (en) * 1998-12-16 2002-03-12 Microsoft Corporation Method and apparatus for import, transform and export of data
WO2000065486A2 (en) * 1999-04-09 2000-11-02 Sandpiper Software, Inc. A method of mapping semantic context to enable interoperability among disparate sources

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006040237A1 (en) * 2004-10-13 2006-04-20 Siemens Aktiengesellschaft Method for converting data
EP1857946A2 (en) 2006-05-16 2007-11-21 Sap Ag Systems and methods for migrating data
EP1857946A3 (en) * 2006-05-16 2007-12-26 Sap Ag Systems and methods for migrating data
US8667382B2 (en) * 2006-06-28 2014-03-04 International Business Machines Corporation Configurable field definition document
US20080005165A1 (en) * 2006-06-28 2008-01-03 Martin James A Configurable field definition document
EP2043013A1 (en) * 2007-09-19 2009-04-01 Accenture Global Services GmbH Data mapping document design system
EP2043012A1 (en) * 2007-09-19 2009-04-01 Accenture Global Services GmbH Data mapping design tool
US7801884B2 (en) 2007-09-19 2010-09-21 Accenture Global Services Gmbh Data mapping document design system
US7801908B2 (en) 2007-09-19 2010-09-21 Accenture Global Services Gmbh Data mapping design tool
US9104668B2 (en) 2010-07-01 2015-08-11 Hewlett-Packard Development Company, L.P. Migrating artifacts between service-oriented architecture repositories
EP2589010A4 (en) * 2010-07-01 2014-04-30 Hewlett Packard Development Co Migrating artifacts between service-oriented architecture repositories
EP2589010A1 (en) * 2010-07-01 2013-05-08 Hewlett-Packard Development Company, L.P. Migrating artifacts between service-oriented architecture repositories
US20120246060A1 (en) * 2011-03-25 2012-09-27 LoanHD, Inc. Loan management, real-time monitoring, analytics, and data refresh system and method
US10956381B2 (en) 2014-11-14 2021-03-23 Adp, Llc Data migration system
CN104899333A (en) * 2015-06-24 2015-09-09 浪潮(北京)电子信息产业有限公司 Cross-platform migrating method and system for Oracle database
US10162611B2 (en) 2016-01-04 2018-12-25 Syntel, Inc. Method and apparatus for business rule extraction
US10162612B2 (en) 2016-01-04 2018-12-25 Syntel, Inc. Method and apparatus for inventory analysis
US10162610B2 (en) 2016-01-04 2018-12-25 Syntel, Inc. Method and apparatus for migration of application source code
GB2546110A (en) * 2016-01-11 2017-07-12 Healthera Ltd Data processing method and apparatus
US10360190B2 (en) 2016-03-31 2019-07-23 Microsoft Technology Licensing, Llc Migrate data in a system using extensions
US10977565B2 (en) 2017-04-28 2021-04-13 At&T Intellectual Property I, L.P. Bridging heterogeneous domains with parallel transport and sparse coding for machine learning models
US10606573B2 (en) 2017-06-07 2020-03-31 Syntel, Inc. System and method for computer language migration using a re-architecture tool for decomposing a legacy system and recomposing a modernized system
CN110688367A (en) * 2019-09-27 2020-01-14 浪潮软件集团有限公司 Universal database migration adaptation method and system
CN112035461A (en) * 2020-06-17 2020-12-04 深圳市法本信息技术股份有限公司 Migration method and system for table data file of database
CN116340411A (en) * 2023-05-31 2023-06-27 物产中大数字科技有限公司 Data processing method and device
CN116340411B (en) * 2023-05-31 2024-02-27 物产中大数字科技有限公司 Data processing method and device

Also Published As

Publication number Publication date
WO2004077215A3 (en) 2005-05-26

Similar Documents

Publication Publication Date Title
US6374252B1 (en) Modeling of object-oriented database structures, translation to relational database structures, and dynamic searches thereon
US8918447B2 (en) Methods, apparatus, systems and computer readable mediums for use in sharing information between entities
US7117215B1 (en) Method and apparatus for transporting data for data warehousing applications that incorporates analytic data interface
WO2004077215A2 (en) System and method for data migration and conversion
US7805341B2 (en) Extraction, transformation and loading designer module of a computerized financial system
US10095717B2 (en) Data archive vault in big data platform
EP0978061B1 (en) Object graph editing context and methods of use
US8010905B2 (en) Open model ingestion for master data management
US11893036B2 (en) Publishing to a data warehouse
US20050071359A1 (en) Method for automated database schema evolution
US20030208493A1 (en) Object relational database management system
US20040139070A1 (en) Method and apparatus for storing data as objects, constructing customized data retrieval and data processing requests, and performing householding queries
AU2002364538B2 (en) Database system having heterogeneous object types
US8639717B2 (en) Providing access to data with user defined table functions
US7840603B2 (en) Method and apparatus for database change management
US7139768B1 (en) OLE DB data access system with schema modification features
US20050114404A1 (en) Database table version upload
KR101820108B1 (en) A query processing system for 2-level queries by integrating cache tables
EP1374090A2 (en) Computer method and device for transporting data
US20050262070A1 (en) Method and apparatus for combining of information across multiple datasets in a JavaScript environment
Hamilton ADO. NET Cookbook
Watson Beginning C# 2005 databases
KR100505111B1 (en) The apparatus and method of creating program source for operating database and the computer program product using the same
US20050262156A1 (en) Method and apparatus for informational comparison of multiple datasets in a javascript environment
Crook Visual foxpro client-server handbook

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
WWE Wipo information: entry into national phase

Ref document number: 449.06

Country of ref document: BZ