Join with uniouned returns 1899 bad distributed join plan

I have events that are inserted to table game_events and other events that are inserted to table rejected_events.

I want to union them and then do bunch of joins with other data of other tables.

I run this query directly on the memSQL console:

SELECT combinedEvents.*, win.*
 FROM  (
  SELECT event_id, action, status
 FROM events.game_events
 WHERE event_arrival_time BETWEEN '2019-12-17T00:00:00Z' AND '2019-12-30T23:59:59.999Z' 
 UNION  
 SELECT event_id, action,status
 FROM rejected.events
 WHERE event_arrival_time BETWEEN '2019-12-17T00:00:00Z' AND '2019-12-30T23:59:59.999Z' 
 ) AS combinedEvents  
 LEFT JOIN winner.winner_data AS win ON combinedEvents.event_id = win.event_id AND win.status = 'ACCEPTED' 

I get from mem:

Error Code: 1889. Bad distributed join plan: leaf select contains sharded tables of multiple databases. Please contact technical support.

If I remove the JOIN - It works properly.

Any ideas?

Thanks and best regards, Ido

I was able to reproduce your issue with

drop database if exists events;
drop database if exists rejected;
drop database if exists winner;
create database events;
create database rejected;
create database winner;

use events;
create table game_events(event_id int, action int , status int, event_arrival_time timestamp);

use rejected;
create table events(event_id int, action int , status int, event_arrival_time timestamp);

use winner;
create table winner_data(event_id int);

SELECT combinedEvents.*, win.*
 FROM  (
  SELECT event_id, action, status
 FROM events.game_events
 WHERE event_arrival_time BETWEEN '2019-12-17T00:00:00Z' AND '2019-12-30T23:59:59.999Z' 
 UNION  
 SELECT event_id, action,status
 FROM rejected.events
 WHERE event_arrival_time BETWEEN '2019-12-17T00:00:00Z' AND '2019-12-30T23:59:59.999Z' 
 ) AS combinedEvents  
 LEFT JOIN winner.winner_data AS win ON combinedEvents.event_id = win.event_id

The error is being thrown on the UNION on the left side of the join. Changing the table rejected.events to events.rejected_events solved the issue for me locally.

drop database if exists events;
drop database if exists winner;
create database events;
create database winner;

use events;
create table game_events(event_id int, action int , status int, event_arrival_time timestamp);

create table rejected_events(event_id int, action int , status int, event_arrival_time timestamp);

use winner;
create table winner_data(event_id int);

use events;

SELECT combinedEvents.*, win.*
 FROM  (
  SELECT event_id, action, status
 FROM events.game_events
 WHERE event_arrival_time BETWEEN '2019-12-17T00:00:00Z' AND '2019-12-30T23:59:59.999Z' 
 UNION  
 SELECT event_id, action,status
 FROM rejected_events
 WHERE event_arrival_time BETWEEN '2019-12-17T00:00:00Z' AND '2019-12-30T23:59:59.999Z' 
 ) AS combinedEvents  
 LEFT JOIN winner.winner_data AS win ON combinedEvents.event_id = win.event_id

Alternatively, without changing the schema, joining each select in the UNION separately also ran for me:

(SELECT combinedEvents.*, win.*
 FROM  (
  SELECT event_id, action, status
 FROM events.game_events
 WHERE event_arrival_time BETWEEN '2019-12-17T00:00:00Z' AND '2019-12-30T23:59:59.999Z' 
 ) AS combinedEvents  
 LEFT JOIN winner.winner_data AS win ON combinedEvents.event_id = win.event_id)
UNION ALL
(SELECT combinedEvents.*, win.*
 FROM  (
 SELECT event_id, action,status
 FROM rejected.events
 WHERE event_arrival_time BETWEEN '2019-12-17T00:00:00Z' AND '2019-12-30T23:59:59.999Z' 
 ) AS combinedEvents  
 LEFT JOIN winner.winner_data AS win ON combinedEvents.event_id = win.event_id);

In my repro, all tables are empty. Pretty sure swapping the order of the JOIN and UNION creates an equivalent query in the case, but I didn’t test that claim :slight_smile:

Hope this helps!