calebb
[H]F Junkie
- Joined
- Mar 26, 2000
- Messages
- 8,758
This is somewhat of an academic question, but one that's been on my mind for a few months now. Actually, perhaps it's not academic and there really is a better way to do this - Anyway, I'll try to spare you as many boring details as I can, but feel free to ask for more background details.
I (and a database contractor, who also frequents the [H]) built a large e-commerce system over the summer. The items we sell are listed on many different marketplaces via whatever API the marketplace has provided. An early design decision was to have the order-import application take all the raw order report data and put it directly into a "raw" table in our database.
Then, a separate database stored procedure parses the order-raw data into the correct location based on a "mapping" table. For example, Amazon.com has a order report header named "order-id" - this maps to "MarketplaceOrderID" in our database which would go in the "ORDERS" table.
A quirky detail for most of the marketplaces we sell on: If a customer orders more than one item (different product id / ISBN), the marketplace actually returns all their order details (name, address, email address, MarketplaceOrderID) once for each unique item purchased. The only thing that makes each line unique is the ISBN (ASIN) of the product purchased and something that is generally named "Order-Item-ID." If a customer orders a qty > 1 for a specific item, that is returned on a single line.
Ok, we're getting closer to my question.
I try to follow good software design practices (n-tier in this case) and keep my code as reusable as possible. I also prefer to not have a single method that has many nested if/switch statements to maximize readability/maintainability. In most cases I can keep it to 1 or 2 levels max. But not in this case (so far).
So the two things that I'm specifically having issues with:
1. Structuring this method so that is is reusable between all marketplaces. Specifically, each marketplace has a few additional pieces of data manipulation that must happen.
.a. Some marketplaces do not use a standard (importable) datetime field. (in this case, I need to detect it via regex, pass it to the method to clean it up, and format it to a valid .NET datetime variable)
.b. Or, on a different marketplace, they don't include a shipping method ~10% of the time - so I need to set it to "standard" in the application. (I prefer not to put that logic in the database)
.c. On yet another marketplace, they sometimes accidentally duplicate a line in the order report file - so I need to check to make sure this order-item-id was not just processed.
.d. I have ~5 other marketplace-specific exceptions - so rather than drop all the data variation handling into a single common method (which really becomes unmaintainable), I have many different variations of this method - one for each marketplace. (new problem - spaghetti-ish code that still feels overly complicated)
2. Additionally, this is by far the ugliest section of code in the entire project (which includes 20+ different applications)... and remember, I omitted the code for handling the data variations I mentioned above - it's even messier than this in real life...
Semantically, this is all that the method needs to do:
1. Parse the order report headers by whatever delimiter necessary (tab, comma, pipe are the possibilities).
2. Iterate through the report line-by-line
.a. Make sure the number of fields matches the number of fields in the header
.b. Split by delimiter into an array
.c. Determine if this row represents is a new order in our database, a previously inserted order in our database or an additional row (order-item) for an order I'm currently importing.
.d. Check for (various marketplace specific details)
.e. Perform database insert.
Here's some code:
Can anyone offer some advice / critique on this?
Thanks in advance!
I (and a database contractor, who also frequents the [H]) built a large e-commerce system over the summer. The items we sell are listed on many different marketplaces via whatever API the marketplace has provided. An early design decision was to have the order-import application take all the raw order report data and put it directly into a "raw" table in our database.
Then, a separate database stored procedure parses the order-raw data into the correct location based on a "mapping" table. For example, Amazon.com has a order report header named "order-id" - this maps to "MarketplaceOrderID" in our database which would go in the "ORDERS" table.
A quirky detail for most of the marketplaces we sell on: If a customer orders more than one item (different product id / ISBN), the marketplace actually returns all their order details (name, address, email address, MarketplaceOrderID) once for each unique item purchased. The only thing that makes each line unique is the ISBN (ASIN) of the product purchased and something that is generally named "Order-Item-ID." If a customer orders a qty > 1 for a specific item, that is returned on a single line.
Ok, we're getting closer to my question.
I try to follow good software design practices (n-tier in this case) and keep my code as reusable as possible. I also prefer to not have a single method that has many nested if/switch statements to maximize readability/maintainability. In most cases I can keep it to 1 or 2 levels max. But not in this case (so far).
So the two things that I'm specifically having issues with:
1. Structuring this method so that is is reusable between all marketplaces. Specifically, each marketplace has a few additional pieces of data manipulation that must happen.
.a. Some marketplaces do not use a standard (importable) datetime field. (in this case, I need to detect it via regex, pass it to the method to clean it up, and format it to a valid .NET datetime variable)
.b. Or, on a different marketplace, they don't include a shipping method ~10% of the time - so I need to set it to "standard" in the application. (I prefer not to put that logic in the database)
.c. On yet another marketplace, they sometimes accidentally duplicate a line in the order report file - so I need to check to make sure this order-item-id was not just processed.
.d. I have ~5 other marketplace-specific exceptions - so rather than drop all the data variation handling into a single common method (which really becomes unmaintainable), I have many different variations of this method - one for each marketplace. (new problem - spaghetti-ish code that still feels overly complicated)
2. Additionally, this is by far the ugliest section of code in the entire project (which includes 20+ different applications)... and remember, I omitted the code for handling the data variations I mentioned above - it's even messier than this in real life...
Semantically, this is all that the method needs to do:
1. Parse the order report headers by whatever delimiter necessary (tab, comma, pipe are the possibilities).
2. Iterate through the report line-by-line
.a. Make sure the number of fields matches the number of fields in the header
.b. Split by delimiter into an array
.c. Determine if this row represents is a new order in our database, a previously inserted order in our database or an additional row (order-item) for an order I'm currently importing.
.d. Check for (various marketplace specific details)
.e. Perform database insert.
Here's some code:
Code:
String rawOrderData = this.GetMarketplaceOrderReport(wr); //the webrequest
if (strResult == string.Empty)
{
m_validate.SendDebugEmail("[TBv2][SEV2][Miner] Error getting report id " + intReportID.ToString(), "See Subject");
return false;
}
// This method has its own error handling - throws differnet exceptions when errors encountered
String rawOrderData = this.GetMarketplaceOrderReport(wr);
String stringDelimiter = "\t"; //This is actually passed into this method
String[] arrayOrderDataRows = strResult.Split('\n');
String[] arrayHeadersDataRow = arrayOrderDataRows[0].Split('\t');
String stringLastOrderID = String.Empty; //To know when to request new orderid
Int64 intIdOrder = 0;
Int16 intArrayOffsetForMarketplaceOrderID = 1; //This is passed into this method - self descriptive
Int16 intArrayOffsetForOrderItemQuantity = 10; //This is the offset in arrayOrderDataRow for the qty of this item
bool boolNewOrder = true; //Determines if this is a new order, or another line in an existing order
int intIdRow = 0; //Internal counter to assist the store procedure to know when we're on a new line
//start at 1 to skip first (headers) row
for (int i = 1; i < arrayOrderDataRows.Length; i++)
{
String[] arrayOrderDataRow = arrayDataRows[i].Split(stringDelimiter);
if (arrayOrderDataRow.Length != arrayHeadersDataRow.Length)
{
//Occasionally a marketplace sends a malformed order report
//sends a debug email and logs error to database
}
else
{
//Only consider getting a new Order ID if this is a new order
if (strLastOrderID != arrayOrderDataRow[intArrayOffsetForMarketplaceOrderID])
{
//I say /consider/ because sometimes marketplaces send us a order a second time
//I always run a check against the database to see if the marketplace-order-id is new or
//...previously imported.
boolNewOrder = CheckIfNewMarketplaceOrderId(arrayOrderDataRow[1].ToString()); //returns FALSE if this order was previously imported
if (boolNewOrder)
{
//Set to true so that this row in the order report will be imported.
boolNewOrder = true;
//Also get a new internal orderid from the ID Broker.
intIdOrder = this.GetIdOrder();
}
}
//This is not the /LastOrderID/ yet... next time through the loop it will be
strLastOrderID = arrayOrderDataRow[1].ToString();
if (boolNewOrder) //If true, we know we must process the current arrayOrderDataRow
{
for (int j = 0; j < Convert.ToInt16(arrayOrderDataRow[intArrayOffsetForOrderItemQuantity]); j++) //iterator for buying multiple copies of the same item
{
intIdRow++; // You can ignore this - this is an internal counter that is used by the stored procedure to know we are on a new line
for (int k = 0; k < arrayOrderDataRow.Length; k++)
{
if (arrayOrderDataRow[k] != string.Empty) //No need to insert NULLS into the database
{
// I snipped out all the marketplace specific code here
// ...it's a lot of code...
try
{
this.AddRawOrderField(intIdOrder, intIdRow, arrHeaders[k], arrayOrderDataRow[k]);
}
catch(Exception ex)
{
throw new ExErrorAddingRawOrderField(m_intIdMarketplace, intIdOrder, ex)
}
}
}
}
}
}
}
Can anyone offer some advice / critique on this?
Thanks in advance!