[spark] support lateral inner join for vector search #8252

Open
Stefanietry wants to merge 1 commit into apache:master from Stefanietry:support_lateral_join_for_vector_search

@Stefanietry

Contributor

Purpose
Support lateral inner join for vector search in Spark.
Linked issue: #8251

Tests
Added vector search with lateral join tests in org.apache.paimon.spark.SparkMultimodalITCase#testVector and org.apache.spark.sql.test.SQLTestUtils#test("lateral vector search preserves subquery alias qualifiers").

@Stefanietry force-pushed the support_lateral_join_for_vector_search branch from 8aa9c09 to 774c9b6 on June 16, 2026 08:05

override protected def doExecute(): RDD[InternalRow] = {
  child.execute().mapPartitions {
    outerRows =>

Contributor

Can batch queries be supported? Batch queries are crucial for performance. You can take a look at the benchmark in https://github.com/apache/paimon-vector-index

Contributor Author

Thanks for the reminder. I'll rework it to use batch mode later.
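
For readers following this thread, here is a rough sketch of what batching inside a per-partition iterator could look like. It is not the PR's code and does not use any Paimon or Spark API: `BatchedLateralSearch`, `lateralInnerJoin`, `batchSize`, and `searchBatch` are all hypothetical names, and `searchBatch` stands in for whatever batched index lookup the real implementation would call. The idea is only to group outer rows, issue one search per group, and emit one joined pair per hit, so outer rows with no hits drop out as in an inner join.

```scala
// Hypothetical sketch only: names and signatures are assumptions, not Paimon APIs.
object BatchedLateralSearch {

  /** Joins each outer row with its search hits, querying the index once per batch.
    *
    * @param outerRows   rows from the outer side of the lateral join (one partition)
    * @param batchSize   how many outer rows to send to the index per call
    * @param searchBatch assumed batched lookup: returns one hit list per input row,
    *                    in the same order as the input
    */
  def lateralInnerJoin[Q, R](outerRows: Iterator[Q], batchSize: Int)(
      searchBatch: Seq[Q] => Seq[Seq[R]]): Iterator[(Q, R)] = {
    require(batchSize > 0, "batchSize must be positive")
    // grouped() is lazy, so only one batch of outer rows is buffered at a time.
    outerRows.grouped(batchSize).flatMap { batch =>
      val results = searchBatch(batch)
      require(results.length == batch.length, "expected one result list per query")
      // Inner-join semantics: a row with an empty hit list produces no output.
      batch.iterator.zip(results.iterator).flatMap {
        case (row, hits) => hits.iterator.map(hit => (row, hit))
      }
    }
  }
}
```

In a physical operator shaped like the snippet under review, a function like this would presumably be applied to `outerRows` inside `mapPartitions`; choosing `batchSize` would be a trade-off between per-call overhead and the memory held per batch.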

@JingsongLi

Contributor

Please fix test failures.