Search Process Overview
The search engine follows a structured eight-step process when handling a query:
Query Handling Process
1. Preprocessing the Search Term
   Before running the search, the system prepares the term by:
   - Removing special characters
   - Normalizing terms (e.g., "Levi's" → "levis", "s.oliver" → "soliver")
   - Removing hyphens and thousand separators
   - Converting negation words (e.g., "without" → "-" prefix)
   - Removing superscript numbers/letters
   - Removing stop words (e.g., "the", "and", "also")
   - Replacing umlauts
   - Converting to lowercase
   - Splitting compound words
   - Applying stemming and spelling correction
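A few of the preprocessing rules above can be illustrated with a minimal Python sketch. It is not the engine's actual implementation: the function name `preprocess`, the lookup tables, the umlaut mapping (ä → ae, and so on), and the order of operations are all assumptions made for illustration. Compound splitting, stemming, and spelling correction are left out.

```python
import re

# Hypothetical lookup tables; the real system's lists are not given in the text.
STOP_WORDS = {"the", "and", "also"}
NEGATION_WORDS = {"without"}
# Assumed umlaut mapping; the text only says that umlauts are replaced.
UMLAUTS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def preprocess(query: str) -> list[str]:
    """Return normalized tokens; a leading '-' marks a negated term."""
    text = query.lower().translate(UMLAUTS)
    # Drop thousand separators between digits: "1.000" or "1,000" -> "1000".
    text = re.sub(r"(?<=\d)[.,](?=\d{3}\b)", "", text)
    # Drop apostrophes, dots, and hyphens inside terms: "levi's" -> "levis".
    text = re.sub(r"['’.\-]", "", text)
    # Replace remaining special characters (including superscripts) with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)

    tokens = []
    negate_next = False
    for word in text.split():
        if word in NEGATION_WORDS:
            # Remember the negation and attach it to the next kept word.
            negate_next = True
            continue
        if word in STOP_WORDS:
            continue
        tokens.append("-" + word if negate_next else word)
        negate_next = False
    return tokens
```

With these assumptions, `preprocess("Shoes without laces")` yields `["shoes", "-laces"]` and `preprocess("s.Oliver")` yields `["soliver"]`.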
2. Checking for Direct Links
   If a predefined direct link exists for the query, the user is redirected to one of the following:
   - Content page
   - Product detail page
   - Specific internal or external URL

   If no link is found, the process continues.
3. Searching for Replacements
   The system checks for predefined replacements for the term. If any are found, they are used in Step 7 (result merging). Either way, the search continues as normal.
4. Finding Synonyms
   The system checks for stored synonyms and adds them to the query. Synonym matches receive a small ranking penalty.
5. Finding Synonyms for Misspellings
   Spellchecker-corrected terms are also added as synonyms. A penalty is applied based on the distance between the original and the corrected word.
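The two synonym steps can be sketched as a query expansion that attaches a penalty to each added term. The text does not say which distance measure or which penalty values are used, so this sketch assumes Levenshtein edit distance and two made-up constants (`SYNONYM_PENALTY`, `PENALTY_PER_EDIT`); the function `expand_term` is hypothetical as well.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (assumed distance measure)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical penalty values; the real ones are not documented here.
SYNONYM_PENALTY = 2
PENALTY_PER_EDIT = 5


def expand_term(term: str,
                synonyms: dict[str, list[str]],
                correction: Optional[str] = None) -> dict[str, int]:
    """Map each query variant to the ranking penalty it carries."""
    expanded = {term: 0}
    # Stored synonyms get a small fixed penalty.
    for synonym in synonyms.get(term, []):
        expanded.setdefault(synonym, SYNONYM_PENALTY)
    # A spellchecker correction is penalized in proportion to its distance.
    if correction and correction != term:
        penalty = PENALTY_PER_EDIT * levenshtein(term, correction)
        expanded.setdefault(correction, penalty)
    return expanded
```

Under these assumptions, `expand_term("jaens", {}, "jeans")` returns `{"jaens": 0, "jeans": 10}`, because the two spellings are two edits apart.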
6. Searching the Search Index
   The search uses the previously generated `wordList` and `cutWordList`. Every result starts at 100 points. Points are deducted based on the data field in which the term appears:

   | Data Field | Penalty | Description |
   |---|---|---|
   | Brand | 0 | Product brand |
   | Short Description | 0 | Shown in result overview |
   | Target Group | 0 | Men, Women, Children |
   | Categories | 4 | Category placement |
   | Product Group | 4 | Internal classification |
   | Market Identifier (MKZ) | 4 | Internal marker |
   | Primary Color | 8 | Main product color |
   | Filter Values | 8 | Matched filterable attributes |
   | Dimensions | 8 | Size categories like “Short”, “Normal”, “Long” |
   | Long Description | 40 | Full text description |

7. Merging Results (if applicable)
   If replacements were found in Step 3, the resulting result sets are merged and re-ranked.
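The field penalties from Step 6 and the merge in Step 7 can be sketched together. The penalty values come from the table; everything else is an assumption for illustration: the dictionary keys, the `term_penalty` parameter (standing in for synonym or spelling penalties), and the merge policy of keeping each product's best score are not described in the text.

```python
BASE_SCORE = 100

# Penalties from the table; the snake_case keys are hypothetical field names.
FIELD_PENALTIES = {
    "brand": 0,
    "short_description": 0,
    "target_group": 0,
    "categories": 4,
    "product_group": 4,
    "market_identifier": 4,
    "primary_color": 8,
    "filter_values": 8,
    "dimensions": 8,
    "long_description": 40,
}


def score_match(field: str, term_penalty: int = 0) -> int:
    """Score a hit: start at 100, subtract the field and term penalties."""
    return BASE_SCORE - FIELD_PENALTIES[field] - term_penalty


def merge_results(*result_sets: dict[str, int]) -> list[tuple[str, int]]:
    """Merge result sets and re-rank (assumed policy: best score per product)."""
    merged: dict[str, int] = {}
    for results in result_sets:
        for product_id, score in results.items():
            merged[product_id] = max(score, merged.get(product_id, score))
    # Highest score first; ties are broken by product id for a stable order.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))
```

For example, a match in the long description scores `score_match("long_description")`, which is 60, while a brand match keeps the full 100 points.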
8. Outputting the Results
   Before the final results are returned:
   - The product ranking is finalized
   - Filters are calculated
   - Results are serialized as JSON (the structure can be tenant-specific)
   - Short-term caching applies only for large result sets
Ranking of Variations
Many products have multiple variants (e.g., color, size). The system ranks these variations using:
- Rules from the Rule Engine
- A BI-based scoring system to prioritize the best variant
Only the top-ranked variant is shown in the results.
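The "one variant per product" behavior can be sketched as a simple reduction. How the Rule Engine and the BI-based scoring combine is not described here, so the single `score` field below is a hypothetical stand-in for that combined value, and `Variant` and `top_variants` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    product_id: str
    variant_id: str
    score: float  # hypothetical combined Rule Engine and BI score


def top_variants(variants: list[Variant]) -> list[Variant]:
    """Keep only the highest-scoring variant of each product."""
    best: dict[str, Variant] = {}
    for variant in variants:
        current = best.get(variant.product_id)
        if current is None or variant.score > current.score:
            best[variant.product_id] = variant
    # Products appear in the order in which they were first seen.
    return list(best.values())
```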
Additional Factors in Search Ranking
Availability (Stock Level)
| Stock Level | Penalty |
|---|---|
| ≥ 80% available | No penalty |
| 45% to < 80% available | -10 points |
| > 0% to < 45% available | -50 points |
| 0% (out of stock) | Excluded entirely |
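The availability table maps directly onto a small function. This is a sketch: the name `stock_penalty` and the representation of availability as a fraction between 0 and 1 are assumptions, and exactly 80% is treated as "no penalty", following the "≥ 80%" row.

```python
from typing import Optional


def stock_penalty(available_share: float) -> Optional[int]:
    """Return the points to deduct, or None if the product is excluded."""
    if available_share <= 0:
        return None  # out of stock: excluded from the results entirely
    if available_share >= 0.80:
        return 0
    if available_share >= 0.45:
        return 10
    return 50
```

A caller would drop products for which the function returns `None` and subtract the returned points from the score of all others.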
Business Rule Value
- Adds 5% weight to the product ranking
- Values are normalized logarithmically
- Top product gets a 100% boost, others scale accordingly
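One possible reading of the logarithmic normalization is sketched below. The exact formula is not given in the text, so the use of `log1p`, the function name `business_rule_boosts`, and the way the 5% weight is applied are all assumptions: the top product maps to a boost of 1.0 (100%), and every other product scales relative to it on a log scale.

```python
import math

BUSINESS_RULE_WEIGHT = 0.05  # the 5% weight mentioned above


def business_rule_boosts(values: dict[str, float]) -> dict[str, float]:
    """Map raw business rule values to boosts in [0, 1] on a log scale."""
    top = max(values.values(), default=0.0)
    if top <= 0:
        # No positive values: nothing to boost.
        return {product_id: 0.0 for product_id in values}
    scale = math.log1p(top)
    return {
        product_id: math.log1p(max(value, 0.0)) / scale
        for product_id, value in values.items()
    }


def weighted_boost(boost: float) -> float:
    """Share of the final ranking contributed by the business rule value."""
    return BUSINESS_RULE_WEIGHT * boost
```

The log scale dampens large gaps between raw values: under this assumption, a product with a raw value of 10 still receives roughly a third of the boost of a product with a raw value of 1000.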